Over the past year and a half, Louisville Metro government has been making news and winning awards for an innovative approach toward handling the city’s data: making it public for anyone who’s interested.
But simply making data available is only part of the solution. On its own, the data can give the wrong impression or sometimes run counter to the facts. And citizens have to be able to interpret it, too.
Louisville Mayor Greg Fischer is a huge proponent of the city’s Open Data Portal, an online destination for thousands of data points ranging from where crimes occur to what Metro employees are paid.
“Well, Louisville’s one of the leaders in the country with open data and it’s been important for me that all of our data goes online,” Fischer told WFPL recently. “It’s the people’s data, it should be open.”
Citizens can access the nearly 200 data sets on the portal, most of which are fed by Metro government data. Some of that information is housed in a server room at 410 S. 5th Street, next door to the city’s Information Technology office. From there, it can be remotely retrieved and uploaded onto the portal, said IT director Chris Seidt.
Much of that data is made available in a “raw” format, meaning it’s presented in spreadsheets that can be sorted by humans and computers alike. But although it may be “the people’s data,” it may not be shared in a way that the average citizen can easily understand.
Consider crime data: it makes up nearly half of the 21,000 downloads of city data since late 2016, when the new version of the portal went live, according to Google Analytics figures available on the portal. The data sets are updated automatically every 24 hours, but sometimes include duplicates or incidents from past years.
Louisville Metro Police Department spokeswoman Jessie Halladay said the daily data reports are “live,” which means they haven’t been combed for anomalies. Experts say that’s expected in large data sets. LMPD has systems in place to clean up the data before leaders receive their official reports every Monday.
“It’s raw data, which is useful, but isn’t necessarily 100 percent cleaned,” Halladay said.
I found that out firsthand recently, when the homicide rates I pulled from the Open Data Portal didn’t match those provided by the police department. The 2017 file I was looking at needed to be trimmed — by more than 15,000 line items — to get rid of duplicates and incidents that took place in previous years.
Maybe I should have known to look more closely at each incident number. And Halladay said LMPD could also provide a better description of the state of its data on the portal.
But if I, a journalist with a working understanding of open data, could run into this problem, what did that mean for average citizens — like people who may be grabbing data sets off the site to check neighborhood crime?
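For readers who want to try the kind of trimming described above, here is a minimal sketch in Python. It is not LMPD's cleanup process or the portal's own tooling. It assumes a downloaded spreadsheet with one column holding the incident number and another holding the date the incident occurred. The column names `INCIDENT_NUMBER` and `DATE_OCCURRED`, the date format and the sample rows are all hypothetical, and whether two rows that share an incident number are true duplicates depends on the data set.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names; the portal's actual headers may differ.
ID_COLUMN = "INCIDENT_NUMBER"
DATE_COLUMN = "DATE_OCCURRED"


def clean_incidents(rows, year):
    """Keep the first row per incident number whose date falls in `year`."""
    seen = set()
    kept = []
    for row in rows:
        try:
            occurred = datetime.fromisoformat(row[DATE_COLUMN].strip())
        except (KeyError, AttributeError, ValueError):
            continue  # skip rows with a missing or unparseable date
        if occurred.year != year:
            continue  # incident took place in a different year
        incident_id = (row.get(ID_COLUMN) or "").strip()
        if not incident_id or incident_id in seen:
            continue  # blank ID, or a duplicate of a row already kept
        seen.add(incident_id)
        kept.append(row)
    return kept


# Made-up sample standing in for a downloaded file.
SAMPLE_CSV = """INCIDENT_NUMBER,DATE_OCCURRED,CRIME_TYPE
A1,2017-01-05 13:00:00,THEFT
A1,2017-01-05 13:00:00,THEFT
B2,2016-12-31 23:50:00,ASSAULT
C3,2017-03-10 08:15:00,BURGLARY
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_incidents(sample_rows, 2017)
print(len(sample_rows), "rows in,", len(cleaned), "rows kept")
```

On the made-up sample, the repeated row and the 2016 incident are dropped, leaving two of the four rows. For a real file, the same function could be fed from `csv.DictReader` opened on the download, once the column names are adjusted to match its headers.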
‘Overwhelming’ For Novices
Ask Tina Maddox, a longtime homeschooling mother who recently earned a degree in software development and now teaches Android development to high school students.
“For someone coming in, it’s probably going to be overwhelming,” she said. “I find it overwhelming … It’s a lot to take in, and it’s a lot to understand.”
Maddox uses data from the city’s portal to create flash briefings for Amazon Echo, so device users can ask Alexa for updates on what’s going on in Louisville’s parks and with other city services.
She does that work as a volunteer with the Civic Data Alliance, a Louisville-based volunteer group that helps citizens get what they need from government data.
Governments should release data raw and in spreadsheets, said Ted Smith, CEO of technology company Revon Systems and former Chief of Civic Innovation for Louisville Metro. That serves two purposes: it allows machines to read the data, and also makes sure the data is free from government interpretation or manipulation.
But it also can make the data “overwhelming,” as Maddox said, for novices.
‘Access is not going to be equal’
Some say expert groups such as the Civic Data Alliance are necessary, since most people can’t figure out the data on their own.
Stephen Larrick, Open Cities Director of the Sunlight Foundation, a national non-partisan non-profit that supports open government efforts, said raw data on its own may not be truly accessible to everyone.
“If you just are sharing the raw data, even if that raw data is the people’s raw data, access is not going to be equal,” he said.
There’s no city doing a perfect job presenting its data so that anyone can use it, Larrick said. It takes specific skills to use raw data. But visualizations, which are easier to digest, only let people access information in predetermined ways.
Groups and companies that take the raw data and translate it can make it useful to the public, said Louisville Data Officer Michael Schnuerle. Take, for example, the review site Yelp, which pulls city health inspection data for its restaurant listings.
But Schnuerle acknowledged there’s a basic level of data literacy needed to work with the portal’s raw feeds. He said he’s working to get more visualizations and tools on the site to help regular people answer specific questions.
“Ideally, if you could develop a system that could automatically look at raw data and spit out the answer that people wanted, that would be very beneficial,” he said. “There’s no other open data portal in the world that has that functionality right now.”
The first phase of the open data movement was getting governments to release their information online, freely and regularly, to the public, said the Sunlight Foundation’s Larrick. Now that the practice is becoming commonplace, the movement is turning to a new focus: making data accessible and comprehensible for everyone.