Developing a data pipeline to improve accessibility
and utilization of Charlottesville’s Open Data Portal
Lucas Beane, Elena Gillis, Rafael Alvarado, Caitlin Wylie
University of Virginia, lhb7tz, emg3sc, rca2t, cdw9y@virginia.edu
Abstract – To improve democratic engagement between
the people and the government, the city of Charlottesville
put forward a proposition to construct an online portal
that would contain data from the city departments that is
considered public by nature. This move was intended to
promote the ease of access to data pertinent to ongoing
policy debates in the city and incentivize the public to
contribute to the policy-making process with informed
participation. Such efforts, while successful at their start,
have gradually stagnated, and the end objective of the
portal has not been reached. In this paper we identify
possible reasons for this stagnation – inconsistent
formatting of the datasets, variables that are not meant
for human legibility, and limited data with
disproportional representation from the city departments.
We then propose a data pipeline that serves as a tool to
extract utility from the data. It does so by converting the
datasets into a consistent format, merging them, and
enabling the creation of simple visualizations. The pipeline
acts as a link between the raw data published by the
government units and the city by making that data more
interpretable and legible and by outputting results that
are easily relatable to the policy issues at hand. We
demonstrate this by analyzing datasets for crime and real
estate and relating our findings to the affordable housing
debate.
Index Terms – open data, Charlottesville, informed policy-
making, data visualization, data pipeline.
INTRODUCTION
Civic data portals have become a common feature of local
governance across the globe, and in response to requests from
the community, the City of Charlottesville made a collective
decision to disclose city department data to the public.
Launching the Charlottesville Open Data Portal was a socially
and strategically important action meant to democratize the
policy-making processes in the city by enabling informed
public participation. Despite this collective effort, the portal
remains largely unutilized by the public. We primarily
attribute this trend to the fragmented nature of the datasets,
inconsistency in variable conventions, and limits of available
information. In our project, we aim to increase usability and
utilization of disclosed data by comprehensively combining
existing datasets to draw additional insights, make them easily
accessible to the public through visualizations on an
interactive dashboard, and demonstrate how such information
can be used to inform policy decisions. To emphasize these
points, we construct a data pipeline that combines data from
the portal into an incorporated dataset that takes all these goals
into account. In addition, by looking at the portal through a
spectrum of user perspectives, ranging from data experts to
non-experts, we develop a series of actionable suggestions
with the goal of improving the overall structure and use of the
portal.
We created a pipeline for dealing with similar data types
and merging individual datasets. We developed a prototypical
model for combining different types of geospatial data, which
ultimately outputs a result in the form of a census block (the
most granular measure attainable with our data) for each entry
with a geospatial location. We aim to expand this sort of
model to other data formats for easier comparison between
datasets. Additionally, we kept track of any hurdles we
encountered that might discourage a non-data scientist from
working with data on the portal. Although open-access tools
exist to interact with the portal’s datasets, they aren’t always
as reliable or powerful as users would like. In addition,
directly downloading the data (to use with third-party
programs) is not always an option because of inconsistent
formatting and extremely large datasets. By
noting these hurdles, we will forward our suggested solutions
to the city staff who maintain the portal, with the goal of
actionable policy revision.
Our pipeline ensures that the data is cleaned and in a
consistent format, allowing us to visualize each dataset
individually and overlay the results to observe any patterns in
the data. We then relate the findings to ongoing policy debates
pertaining to the city’s development. The ultimate goal of this
project is to show the city of Charlottesville how the Open
Data Portal can create a positive impact and incentivize
greater citizen participation in the policy-making process.
BACKGROUND
I. Charlottesville’s Portal
In late 2016, after months of solicitation by the city’s tech
sector, Charlottesville City Council passed a resolution to
begin and support an open data initiative for the city. The
main goals of the project were to let the public have an eye on
what was happening data-wise in the city and, in doing so, to
enable citizens to have more of a say in their local government
through collaborative efforts.
To set up the portal, seven city employees were assigned
to the newly formed Open Data Committee (ODC) as architects and
developers of the portal, with five local citizens acting as a
review committee called the Open Data Advisory Group.
The ODC collected datasets from various public institutions
around Charlottesville while performing pre-processing (e.g.
anonymizing data, removing violent crime data and active
investigations, and correcting obvious errors) so that having a
public eye on the data would be unproblematic. Though the
city maintains that the purpose of the portal is to get citizens
more engaged in local government, many of their efforts to
make the portal more accessible have been relatively
unsuccessful. When asked if the ODC planned to teach the
public how to use the portal, one member stated that “it’s not
for novices” and pointed technological hopefuls in the
direction of D.C.’s open data portal for training [1].
Unfortunately, once the data had been published to the portal,
the city did not see it as its responsibility to teach the
public how to use that data to its potential, which led to the
current predicament.
In August of 2017, the Charlottesville Open Data Portal
had its soft launch, which was eclipsed by the tragedy of a
violent white nationalist rally a few days prior. We partially
explain the lack of public awareness of the portal through the
domination of this event in local attention and the news cycle,
but we also acknowledge the difficulty in promoting a digital
platform to a public audience.
II. Other Cities’ Portals
Other cities in Virginia have implemented open data portals,
and, through using them, some have had tremendous
successes with implementing positive policy change. For
example, Lynchburg utilized direct citizen input to help with
their “Poverty to Progress” plan, a goal of which was to map
food deserts in order to inform non-profits which areas need
the most help [2]. In an endeavor to improve transparency,
Virginia Beach created web apps to get citizens’ direct input
on budget proposals as part of their Information Technology
department’s 10 strategic goals [2]. These two cities garnered
national attention when they were among the first-place
winners of the Center for Digital Government’s 2017 Digital
Cities Survey for their efforts [2]. Cities across the U.S. were
recognized in this competition, and they all had one thing in
common: they had specific problems they wanted to address
with their data portal. Though some agencies in
Charlottesville use the data portal in such a way, we find no
evidence that the city of Charlottesville invites the public to
work with it on these specific uses, as the cities of
Lynchburg and Virginia Beach do. We believe this to be part
of the explanation behind the portal’s under-utilization.
On a different scale but with similar problems, the
national open data portal had a similarly slow start: it launched “in May, 2009,
with just 47 datasets. It was not an instant hit” [3]. That being
said, the national data portal now has over 230,000 datasets,
an indicator that its utility is widely recognized. This narrative
is inspiring for Charlottesville’s portal, as it provides hope for
an achievable goal while still allowing for a period of
stagnation.
PROBLEM DESCRIPTION
The portal’s creators identified democracy and transparency in
policy-making as its main objectives; however, as of now, the
portal is not being utilized to its full potential. In this
paper we outline the stages of building a portal and identify
the main reasons for this underuse. We then propose possible
solutions that would address these hurdles.
We identified three main stages of portal development:
collection, distribution, and presentation of data. The
collection of data is performed by the Open Data Committee
(ODC) that consists of representatives from the city staff. The
ODC works with the heads of city departments to obtain the
data. However, this is complicated because many department heads view
sharing their data as a personal liability: the data is
frequently sensitive, and there are no apparent immediate
benefits to its disclosure. Staff in the city’s tech departments
clean and publish the collected data using Esri’s ArcGIS
platform. Since the data is collected from different sources
that utilize various collection methods and use the data for
different purposes, the published datasets are incompatibly
structured and often cannot be combined to derive additional
information.
Presentation of the data is primarily done by so-called
“super users” of the portal – people with strong technical
backgrounds who use the published data to create
visualizations and draw additional insights. This is where our
project comes in—we aim to make the published data more
comprehensible and accessible as well as create a blueprint
for working with new data as it is published on the portal.
Considering the immense scope of this problem, we focus on
two primary objectives: identifying the main reasons behind
why the portal is not being widely utilized and tackling these
issues by discovering the hurdles that may hinder users. To
increase usability and interpretability of the data, we develop
a data pipeline to demonstrate utilization of the published data
to produce actionable information to inform policy decisions.
Our work can demonstrate to the city (staff, users and
municipal departments) how disclosing public data can create
positive impact, thereby incentivizing more government units
to share data.
DATA
Currently there are 81 datasets published on the portal that are
divided into ten categories: Property and Land, Economy,
City Operations, Public Safety, Demographics, Getting
Around, Recreation, Infrastructure, Environment, GIS Base
Layers. There are over three million observations across all
the datasets with over three hundred variables, with about
75% of this data containing geospatial information. Although
the datasets are largely geospatial, this information is
presented in different formats, e.g., coordinates, parcel
numbers, census tracts and block numbers, and physical
addresses. This generates a layer of complication when
merging the datasets, as the geographical measures do not
always overlap. In addition, the same variables in different
datasets have different interpretations, depending on the
source and purpose of the data, and in the absence of a data
dictionary they are hard to distinguish and define. The
opposite is also true – some variables containing similar and
integrable information had inconsistent names and, as a result,
were difficult to identify and overlay.
APPROACH
I. Data Discovery
At this stage, data collection and maintenance of the portal are
no one’s official responsibility and are performed purely on a
volunteer basis, as was revealed through a meeting with a
representative of the Open Data Advisory Group. Given the
lack of any imminent positive impact from the portal’s initial
opening, the enthusiasm for open data in Charlottesville has
substantially diminished and at times the efforts to improve
the portal are put completely on hold, which hinders any
further attempts at expanding what the portal can accomplish.
Most of the portal’s geospatial data is designed for easy
integration with GIS, which was pointed out by UVA’s
Scholars Lab, but that characteristic is not apparent and
requires advanced technical skills that are not common among
the general public. Nonetheless, viewing the data as
integrable with GIS helps define some of the variables and
expand the amount of utilizable information in the dataset.
The design of the datasets and lack of interpretability of
the variables, as well as the disproportional representation of data
across departments, are explained by the fact that most of the
published data was already collected and available in the city
departments and, as such, formatted to the departments’ initial
needs. No new data is recorded and formatted specifically for
publication on the portal and some departments that had
initially committed to sharing their data started to worry about
the risks of sharing it without knowing who will use the
disclosed information and to what end. In addition,
publication of the data is seen by the city as the final objective,
leaving it to the public to derive their own utility.
II. Data Flow
Figure 1 below shows a flowchart of the three stages of the
portal development. The first two stages - collection and
distribution of the data - are performed by the Open Data
Committee and the city staff and thus are outside the scope of
this project. Instead, we offer additional steps that process the
given data and output user-friendly visualizations that could
increase the usability of the portal and contribute to its initial
goal of a democratized and informed policy-making process.
PIPELINE
Our main goal was to develop an architecture for a data
pipeline that transforms data currently on the portal into
accessible information. We aim to create a model that could
be implemented in any programming language by anyone with some
technical knowledge, with the ultimate output of a tool
that could be utilized by anyone. Along the way, we also take
note of any hurdles we encounter that might discourage other
users from working toward a similar goal.
To begin, we use a simple web scraper, built with the
Python package Selenium, that downloads the datasets from the
portal and stores them locally as a series of files in CSV
format.
We then clean the data using a variety of methods,
primarily using typical packages in Python. The most relevant
approach we take is to assign a confidence level from 1 to 4
to each feature from every dataframe to represent how
confident we are about the entries’ meanings. We then only
keep features with confidence levels of 3 or 4 (the levels
representing the most confidence), discarding the rest. This
approach is a direct result of the problem of lack of
documentation on the portal; a more comprehensive data
dictionary would eliminate the need for this entirely. At this
point, we also make sure to rename our features according to
their perceived meaning, both consolidating similar features
(e.g., changing geospatial data named “X” and “Y” to
“Longitude” and “Latitude”) and separating features with
deceptively similar names (e.g., relabeling “BlockNumber” as either “FIPS
Block Code” or “Street Number”, depending on what it records). We choose to further
amalgamate our geospatial data into a single format—the
census block number—to make the visualization step simpler,
acknowledging that this sacrifices precision in favor of
aggregation.
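The confidence-based filtering and renaming described above can be sketched roughly as follows. This is a minimal illustration, not our exact code: the ratings, the column names “FLAG2” and “CODE_A”, and the helper clean_rows are hypothetical, and only the 1-to-4 scale, the cutoff of 3, and the “X”, “Y”, and “BlockNumber” renames come from the text.

```python
# Hypothetical confidence ratings (1 = meaning unknown, 4 = meaning certain).
# "FLAG2" and "CODE_A" are invented column names used only for illustration.
CONFIDENCE = {"X": 4, "Y": 4, "BlockNumber": 3, "FLAG2": 1, "CODE_A": 2}

# Renames are chosen per dataset, since one raw name can mean different things.
RENAMES = {"X": "Longitude", "Y": "Latitude", "BlockNumber": "Street Number"}

MIN_CONFIDENCE = 3  # keep only features rated 3 or 4


def clean_rows(rows, confidence, renames, min_confidence=MIN_CONFIDENCE):
    """Drop low-confidence features and rename the ones that remain."""
    cleaned = []
    for row in rows:
        kept = {}
        for name, value in row.items():
            # Features with no rating are treated as confidence 0 and dropped.
            if confidence.get(name, 0) >= min_confidence:
                kept[renames.get(name, name)] = value
        cleaned.append(kept)
    return cleaned


# One invented record standing in for a row of a downloaded CSV file.
rows = [
    {"X": "-78.4767", "Y": "38.0293", "BlockNumber": "600",
     "FLAG2": "B", "CODE_A": "17"},
]
cleaned = clean_rows(rows, CONFIDENCE, RENAMES)
print(cleaned)
```

Keeping the ratings in a plain mapping makes the judgment calls visible, so they can be revised if the portal later publishes a data dictionary.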
With the data cleaned, we use the pandas package in
Python to merge similar dataframes on their shared features
(e.g. “Longitude” and “Latitude”), only keeping desired
features in the resulting merge. We primarily work with the
geospatial data at this step, but this could easily be applied to
any type of feature.
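As a rough sketch of the merge step, the snippet below joins two toy dataframes on their shared “Longitude” and “Latitude” features with pandas. The dataframes, their values, and the columns “Offense”, “InternalCode”, and “AssessedValue” are invented for illustration and do not reflect the portal’s actual schemas.

```python
import pandas as pd

# Toy stand-ins for two cleaned portal datasets; all values are invented.
crime = pd.DataFrame({
    "Longitude": [-78.4767, -78.4900, -78.5000],
    "Latitude": [38.0293, 38.0350, 38.0400],
    "Offense": ["Simple Assault", "Towed Vehicle", "Vandalism"],
    "InternalCode": [11, 12, 13],  # hypothetical feature we choose to drop
})
real_estate = pd.DataFrame({
    "Longitude": [-78.4767, -78.4900],
    "Latitude": [38.0293, 38.0350],
    "AssessedValue": [250000, 410000],
})

# Select the desired features first, then merge on the shared location columns.
# An inner join keeps only locations that appear in both datasets.
merged = pd.merge(
    crime[["Longitude", "Latitude", "Offense"]],
    real_estate,
    on=["Longitude", "Latitude"],
    how="inner",
)
print(merged)
```

An exact match on raw coordinates is brittle, because two sources rarely record identical values for the same place; this is one reason a coarser shared key such as the census block number is easier to merge on.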
At this point, any desired analysis could be performed,
and due to the nature of our project, we do so in the form of
visualizations. We choose this medium because our work up
to this point is with geospatial data, which would be intuitive
to analyze visually. We also want to use the visuals as an
intuitive way of working toward our ultimate goal of
increasing accessibility and use of the portal.
FIGURE 1
THREE STAGES OF PORTAL DEVELOPMENT
RESULTS
I. Hurdles to a Portal User
While working on the data pipeline, we took note of multiple
obstacles in utilizing the data. Firstly, there is an uneven
representation of data from different city departments. We
attribute this lack of proportionality to three causes: city
staff’s persisting reluctance to share their data, obscurity in
the existing data that diminishes its utility, and disparate
practices of data collection across city departments. This
results in inconsistency of data formatting, since, as of now,
there are no guidelines for generating the data submitted to
the portal. Data submission is instead performed on a
voluntary basis and any data is seen as good data as long as it
is disclosed to the public. This practice makes it difficult for
both humans and machines to read the data, especially with
the lack of comprehensive data dictionaries for the published
datasets.
There also exist multiple technical issues with the portal.
At times some datasets are unavailable for download. While
some became open after a period of time, a few still remain
inaccessible. Among the datasets that we were able to scrape,
some are too large to be easily read by home computers and
standard software. In cases when these datasets are readable,
it is still difficult to identify significant observations and
variables. We believe that the ability to do so is important
given the main objective of the portal.
II. Examples and Visualizations
Our data pipeline utilizes the raw data published on the portal
and creates a baseline for informed policy debates. It
standardizes geospatial variables to ease dataset
integration and merging, then outputs visualizations that
provide insightful summaries for public users of the portal. To
demonstrate this capability, we use real estate and crime rate
data to produce relevant visualizations and show how these
results can be used in the ongoing policy debate for affordable
housing in Charlottesville.
In a visualization of total real estate value in
Charlottesville since 1997, one can immediately glean a few
insights. For example, when the recession hit the city in 2009, the
total value of properties in Charlottesville dropped. Only
recently did that total regain its momentum, though it
now appears to have recovered well relative to the trend
before 2009. In addition, some of the highest-valued
properties in Charlottesville are labeled along the right, and
one can easily see that besides the hospitals, park, and
shopping center, the top-valued properties in Charlottesville
are apartment complexes that have sprung up since 2011.
Together, these conclusions could potentially lead a user
toward a hypothesis that companies took advantage of lower
cost property values during the recession to buy, develop, and
market upper end housing, further fueling the gentrification
process in Charlottesville.
Visualizing crime data alongside the real estate data helps
point out trends in the distribution of crime and the geospatial
pockets in which certain types of crime are prevalent. Although
the crime dataset is one of the most comprehensible and
well-formatted datasets in the portal, we still see inherent
misrepresentation of values. For instance, geospatially we can
observe that many data points are concentrated around the
police department in downtown Charlottesville. We believe
that this concentration has one of two causes: either the
police station is recorded as the location of the violation for
the crimes that are charged at the station, or the station address
is used as a default in the absence of another address. As a
result, in our geospatial analysis of crime data we exclude the
violations that have the police station as their location of
record to give more weight to the observations that occurred
at other locations.
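A minimal sketch of this exclusion step is shown below, assuming each record carries a free-text address field. The field name “Address”, the placeholder station address, and the helper names are hypothetical; the real default value would need to be confirmed against the dataset itself.

```python
# Placeholder for the station's address of record; this value is made up.
STATION_ADDRESS = "100 STATION ST"


def normalize(address):
    """Uppercase and collapse whitespace so address variants compare equal."""
    return " ".join(address.upper().split())


def exclude_station(records, station_address=STATION_ADDRESS):
    """Return only the records whose location is not the police station."""
    target = normalize(station_address)
    return [r for r in records if normalize(r["Address"]) != target]


# Invented records, including two spellings of the placeholder address.
records = [
    {"Offense": "Simple Assault", "Address": "100 Station St"},
    {"Offense": "Towed Vehicle", "Address": "200  Main St"},
    {"Offense": "Vandalism", "Address": " 100 station st "},
]
kept = exclude_station(records)
print(kept)
```

Normalizing before comparing matters here because a default address entered by hand tends to vary in case and spacing.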
The data points overlaid on the physical map of
Charlottesville illustrate density of crimes across the city as
well as show pockets of concentration for specific types of
violations. Such data, plotted over time, can be valuable
in analyzing the effects of new policies that
aim to reduce crime. In the area diagram plotted across the
time of day, we can see that Towed Vehicles is the most
prevalent type of crime recorded in the afternoons, while
Simple Assault is more common before 10 a.m. In addition,
we see a sudden drop in recorded crimes during
lunch hours. We believe that either fewer offenses were
penalized during that time or they were not logged until later
in the afternoon. In a heatmap for the 20 most common crimes
logged in Charlottesville over the span of the day, we can see
similar patterns (Appendix 2). Crime pattern analysis over the
months of the year identifies the prevalence of crimes such as
Towed Vehicles and Simple Assault in the months of
September and October, which could be explained by UVA
students returning to the campus and the beginning of the
football game season. By observing monthly patterns, the city
can develop strategies to respond to the violations
accordingly.
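The counts behind time-of-day and month-of-year views like these can be produced with a simple tally, sketched below under the assumption that each record has an offense label and a timestamp string. The records and the timestamp format are invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Invented (offense, timestamp) pairs; the timestamp format is an assumption.
records = [
    ("Towed Vehicle", "2018-09-14 14:05"),
    ("Towed Vehicle", "2018-09-21 15:40"),
    ("Simple Assault", "2018-10-06 02:15"),
    ("Towed Vehicle", "2018-10-12 14:30"),
]

by_hour = Counter()   # (offense, hour of day) -> number of records
by_month = Counter()  # (offense, month of year) -> number of records
for offense, stamp in records:
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    by_hour[(offense, when.hour)] += 1
    by_month[(offense, when.month)] += 1

# List the hourly tallies, ordered by offense and then by hour.
for (offense, hour), count in sorted(by_hour.items()):
    print(f"{offense} at {hour:02d}:00: {count}")
```

The same tallies can feed either an area diagram or a heatmap, since both plot offense type against a time bucket.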
CONCLUSION
We successfully built a data pipeline that takes the data from
the portal as input and outputs actionable information in an
accessible way. We created a few sample visualizations to
showcase what this sort of process can achieve. We’ve also
outlined our process such that anyone who wants to either
replicate or make any adjustments to our process can do so in
any desired language.
We encountered myriad problems that could potentially
explain the dearth of users of the portal. Chief among them is
the lack of interpretability of the data; though the datasets
have descriptions, they lack extensive data dictionaries, so
many of the features therein are unusable. This is likely due
to the data collection policy, which has few guidelines to
allow for consistency across datasets. Additionally, we found
that some departments contribute a disproportionate number of
datasets, perhaps because they were willing and able to
offer up more data. If a user isn’t interested in the
over-represented departments, this trend could alienate them
simply because the available data do not form a reliable
picture of Charlottesville. There also still exist simple
problems of accessibility; at times, some datasets on the portal
are not available. On top of that, several datasets are simply
too large for the average home computer to utilize.
Addressing these problems would be an important first step
toward achieving a wider user base for Charlottesville’s open
data portal.
REFERENCES
[1] Wylie, C., Neeley, K., Ferguson, S. 2019. “Beyond Technological
Literacy: Open Data as Active Democratic Engagement?”, accepted for
publication in Digital Culture and Society, January 2019.
[2] Erepublicnews.com. 2019. Digital Cities Winners Leverage
Technology to Enhance Inclusion, Solve Social Challenges.
http://erepublicnews.com/digital-cities-winners-leverage-technology-to-enhance-inclusion-solve-social-challenges.
[3] Forbes.com. 2019. States Offer Information Resources: 50+ Open Data
Portals.
https://www.forbes.com/sites/metabrown/2018/04/30/us-states-offer-information-resources-50-open-data-portals/#18f5850c5225.
AUTHOR INFORMATION
Lucas Beane, MSDS Student, Data Science Institute,
University of Virginia.
Elena Gillis, MSDS Student, Data Science Institute,
University of Virginia.
Rafael Alvarado, Assistant Director, Data Science Institute,
University of Virginia.
Caitlin Wylie, Assistant Professor, School of Engineering
and Applied Science, University of Virginia.
APPENDICES
Appendix 1
Appendix 2

Contenu connexe

Tendances

The Edinburg Housing Authority (EHA), A Research Study on Transparency
The Edinburg Housing Authority (EHA), A Research Study on TransparencyThe Edinburg Housing Authority (EHA), A Research Study on Transparency
The Edinburg Housing Authority (EHA), A Research Study on TransparencyThe University of Texas (UTRGV)
 
Ligado nos Políticos at ESWC'2011 Workshop
Ligado nos Políticos at ESWC'2011 WorkshopLigado nos Políticos at ESWC'2011 Workshop
Ligado nos Políticos at ESWC'2011 WorkshopPablo Mendes
 
Sex Disaggregation of Social Media Posts - Tool Overview
Sex Disaggregation of Social Media Posts - Tool OverviewSex Disaggregation of Social Media Posts - Tool Overview
Sex Disaggregation of Social Media Posts - Tool OverviewUN Global Pulse
 
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...Aravind Sesagiri Raamkumar
 
Turning Weaknesses into Opportunities Case of the Civil Registry Agency
Turning Weaknesses into Opportunities Case of the Civil Registry Agency Turning Weaknesses into Opportunities Case of the Civil Registry Agency
Turning Weaknesses into Opportunities Case of the Civil Registry Agency E-Government Center Moldova
 
Ottenere e visualizzare i dati. Open Data e Big Data
Ottenere e visualizzare i dati. Open Data e Big DataOttenere e visualizzare i dati. Open Data e Big Data
Ottenere e visualizzare i dati. Open Data e Big DataVincenzo Patruno
 
오픈 데이터 현황과 과제
오픈 데이터 현황과 과제오픈 데이터 현황과 과제
오픈 데이터 현황과 과제Haklae Kim
 
OneBusAway - An open-source platform for Mobility as a Service
OneBusAway - An open-source platform for Mobility as a ServiceOneBusAway - An open-source platform for Mobility as a Service
OneBusAway - An open-source platform for Mobility as a ServiceSean Barbeau
 
Oregon State visit 2011
Oregon State visit 2011Oregon State visit 2011
Oregon State visit 2011Diane Hillmann
 
Open Linked Data as Part of a Government Enterprise Architecture
Open Linked Data as Part of a Government Enterprise ArchitectureOpen Linked Data as Part of a Government Enterprise Architecture
Open Linked Data as Part of a Government Enterprise ArchitectureJohann Höchtl
 

Tendances (14)

The Edinburg Housing Authority (EHA), A Research Study on Transparency
The Edinburg Housing Authority (EHA), A Research Study on TransparencyThe Edinburg Housing Authority (EHA), A Research Study on Transparency
The Edinburg Housing Authority (EHA), A Research Study on Transparency
 
Ligado nos Políticos at ESWC'2011 Workshop
Ligado nos Políticos at ESWC'2011 WorkshopLigado nos Políticos at ESWC'2011 Workshop
Ligado nos Políticos at ESWC'2011 Workshop
 
Study on Open Government: A view from local community and university based r...
Study on Open Government:  A view from local community and university based r...Study on Open Government:  A view from local community and university based r...
Study on Open Government: A view from local community and university based r...
 
Linked data migrational framework
Linked data migrational frameworkLinked data migrational framework
Linked data migrational framework
 
Sex Disaggregation of Social Media Posts - Tool Overview
Sex Disaggregation of Social Media Posts - Tool OverviewSex Disaggregation of Social Media Posts - Tool Overview
Sex Disaggregation of Social Media Posts - Tool Overview
 
Wa final
Wa finalWa final
Wa final
 
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...
 
Open Data Trentino presented at the European Commission (JRC)
Open Data Trentino presented at the European Commission (JRC)Open Data Trentino presented at the European Commission (JRC)
Open Data Trentino presented at the European Commission (JRC)
 
Turning Weaknesses into Opportunities Case of the Civil Registry Agency
Turning Weaknesses into Opportunities Case of the Civil Registry Agency Turning Weaknesses into Opportunities Case of the Civil Registry Agency
Turning Weaknesses into Opportunities Case of the Civil Registry Agency
 
Ottenere e visualizzare i dati. Open Data e Big Data
Ottenere e visualizzare i dati. Open Data e Big DataOttenere e visualizzare i dati. Open Data e Big Data
Ottenere e visualizzare i dati. Open Data e Big Data
 
오픈 데이터 현황과 과제
오픈 데이터 현황과 과제오픈 데이터 현황과 과제
오픈 데이터 현황과 과제
 
OneBusAway - An open-source platform for Mobility as a Service
OneBusAway - An open-source platform for Mobility as a ServiceOneBusAway - An open-source platform for Mobility as a Service
OneBusAway - An open-source platform for Mobility as a Service
 
Oregon State visit 2011
Oregon State visit 2011Oregon State visit 2011
Oregon State visit 2011
 
Open Linked Data as Part of a Government Enterprise Architecture
Open Linked Data as Part of a Government Enterprise ArchitectureOpen Linked Data as Part of a Government Enterprise Architecture
Open Linked Data as Part of a Government Enterprise Architecture
 

Similaire à Connecting citizens with public data to drive policy change

IRMC Presentation Open Raleigh January 2013
IRMC Presentation Open Raleigh January 2013IRMC Presentation Open Raleigh January 2013
IRMC Presentation Open Raleigh January 2013Jason Hare (APM, GAIQ)
 
Syracuse open data presentation
Syracuse open data presentationSyracuse open data presentation
Syracuse open data presentationSam Edelstein
 
Research Poster-Exploring the Impact of Web Publishing Budgetary Information ...
Connecting citizens with public data to drive policy change

Developing a data pipeline to improve accessibility and utilization of Charlottesville’s Open Data Portal
Lucas Beane, Elena Gillis, Rafael Alvarado, Caitlin Wylie
University of Virginia, lhb7tz, emg3sc, rca2t, cdw9y@virginia.edu

Abstract – To improve democratic engagement between the people and the government, the city of Charlottesville put forward a proposition to construct an online portal that would contain data from the city departments that is considered public by nature. This move was intended to promote the ease of access to data pertinent to ongoing policy debates in the city and incentivize the public to contribute to the policy-making process with informed participation. Such efforts, while successful at their start, have gradually stagnated, and the end objective of the portal has not been reached. In this paper we identify possible reasons for this stagnation: inconsistent formatting of the datasets, variables that are not meant for human legibility, and limited data with disproportional representation from the city departments. We then propose a data pipeline that serves as a tool to extract utility from the data. It does so by converting the datasets into a consistent format, merging the datasets, and allowing for the creation of simple visualizations. The pipeline acts as a link between the raw data published by the government units and the city by increasing the data's interpretability and legibility and outputting results that are easily relatable to the policy issues at hand. We demonstrate this by analyzing datasets for crime and real estate and relating our findings to the affordable housing debate.

Index Terms – open data, Charlottesville, informed policy-making, data visualization, data pipeline.

INTRODUCTION

Civic data portals have become a common feature of local governance across the globe, and in response to requests from the community, the City of Charlottesville made a collective decision to disclose city department data to the public.
Launching the Charlottesville Open Data Portal was a socially and strategically important action meant to democratize the policy-making processes in the city by enabling informed public participation. Despite this collective effort, the portal remains largely unutilized by the public. We primarily attribute this trend to the fragmented nature of the datasets, inconsistency in variable conventions, and limits of available information.

In our project, we aim to increase usability and utilization of disclosed data by combining existing datasets to draw additional insights, making them easily accessible to the public through visualizations on an interactive dashboard, and demonstrating how such information can be used to inform policy decisions. To pursue these goals, we construct a data pipeline that combines data from the portal into a single integrated dataset. In addition, by looking at the portal through a spectrum of user perspectives, ranging from data experts to non-experts, we develop a series of actionable suggestions with the goal of improving the overall structure and use of the portal.

We created a pipeline for dealing with similar data types and merging individual datasets. We developed a prototypical model for combining different types of geospatial data, which ultimately assigns each entry that has a geospatial location to a census block (the most granular measure attainable with our data). We aim to expand this sort of model to other data formats for easier comparison between datasets.

Additionally, we kept track of any hurdles we encountered that might discourage a non-data scientist from working with data on the portal. Although open-access tools exist to interact with the portal’s datasets, they aren’t always as reliable or powerful as users would like.
In addition, directly downloading the data (to use with third-party programs) is not always an option due to inconsistent formatting in the data and very large datasets. Having noted these hurdles, we will forward our suggested solutions to the city staff who maintain the portal, with the goal of actionable policy revision.

Our pipeline ensures that the data is cleaned and in a consistent format, allowing us to visualize each dataset individually and overlay the results to observe any patterns in the data. We then relate the findings to ongoing policy debates pertaining to the city’s development. The ultimate goal of this project is to show the city of Charlottesville how the Open Data Portal can create a positive impact and incentivize greater citizen participation in the policy-making process.

BACKGROUND

I. Charlottesville’s Portal

In late 2016, after months of solicitation by the city’s tech sector, Charlottesville City Council passed a resolution to begin and support an open data initiative for the city. The main goals of the project were to give the public a view of what was happening data-wise in the city and, in doing so, to enable citizens to have more of a say in their local government through collaborative efforts. To set up the portal, seven city employees were assigned to the newly formed Open Data Committee (ODC) as architects and developers of the portal, with five local citizens acting as a review committee named the Open Data Advisory Group. The ODC collected datasets from various public institutions around Charlottesville while performing pre-processing (e.g. anonymizing data, removing violent crime data and active investigations, and correcting obvious errors) so that exposing the data to the public would be unproblematic.

Though the city maintains that the purpose of the portal is to get citizens more engaged in local government, many of its efforts to make the portal more accessible have been relatively unsuccessful. When asked if the ODC planned to teach the public how to use the portal, one member stated that “it’s not for novices” and pointed technological hopefuls in the direction of D.C.’s open data portal for training [1]. Unfortunately, once the data had been published to the portal, the city did not see it as its responsibility to teach the public how to use the data to its full potential, which led to the current predicament.

In August of 2017, the Charlottesville Open Data Portal had its soft launch, which was eclipsed by the tragedy of a violent white nationalist rally a few days prior. We partially explain the lack of public awareness of the portal through the domination of this event in local attention and the news cycle, but we also acknowledge the difficulty in promoting a digital platform to a public audience.

II. Other Cities’ Portals

Other cities in Virginia have implemented open data portals, and, through using them, some have had tremendous successes with implementing positive policy change. For example, Lynchburg utilized direct citizen input to help with their “Poverty to Progress” plan, a goal of which was to map food deserts in order to inform non-profits which areas need the most help [2]. In an endeavor to improve transparency, Virginia Beach created web apps to get citizens’ direct input on budget proposals as part of their Information Technology department’s 10 strategic goals [2].
These two cities garnered national attention when they were among the first-place winners of the Center for Digital Government’s 2017 Digital Cities Survey for their efforts [2]. Cities across the U.S. were recognized in this competition, and they all had one thing in common: they had specific problems they wanted to address with their data portals. Though some agencies in Charlottesville use the data portal in such a way, we find no evidence that the city of Charlottesville invites the public to work with it on these specific uses, as the cities of Lynchburg and Virginia Beach do. We believe this to be part of the explanation for the portal’s under-utilization.

On a different scale, the national open data portal had a similarly slow start: “In May, 2009, with just 47 datasets. It was not an instant hit” [3]. The national data portal now has over 230,000 datasets, an indicator that its utility is widely recognized. This narrative is encouraging for Charlottesville’s portal, as it shows that a period of stagnation does not rule out eventual success.

PROBLEM DESCRIPTION

The portal’s creators identified democracy and transparency in policy-making as its main objectives; however, as of now, the portal is not being utilized to its full potential. In this paper we outline the stages of building a portal and identify the main reasons for this under-utilization. We then propose possible solutions to these hurdles.

We identified three main stages of portal development: collection, distribution, and presentation of data. The collection of data is performed by the Open Data Committee (ODC), which consists of representatives from the city staff. The ODC works with the heads of city departments to obtain the data.
However, this is complicated because many of them view sharing their data as a personal liability: the data is frequently sensitive, and there are no apparent immediate benefits to its disclosure. Staff in the city’s tech departments clean and publish the collected data using Esri’s ArcGIS platform. Since the data is collected from different sources that use various collection methods and serve different purposes, the published datasets are incompatibly structured and often cannot be combined to derive additional information. Presentation of the data is primarily done by so-called “super users” of the portal: people with strong technical backgrounds who use the published data to create visualizations and draw additional insights.

This is where our project comes in: we aim to make the published data more comprehensible and accessible and to create a blueprint for working with new data as it is published on the portal. Considering the immense scope of this problem, we focus on two primary objectives: identifying the main reasons why the portal is not being widely utilized, and addressing them by documenting the hurdles that may hinder users. To increase usability and interpretability of the data, we develop a data pipeline that demonstrates how the published data can produce actionable information to inform policy decisions. Our work can demonstrate to the city (staff, users, and municipal departments) how disclosing public data can create positive impact, thereby incentivizing more government units to share data.

DATA

Currently there are 81 datasets published on the portal, divided into ten categories: Property and Land, Economy, City Operations, Public Safety, Demographics, Getting Around, Recreation, Infrastructure, Environment, and GIS Base Layers. Across all the datasets there are over three million observations and over three hundred variables, and about 75% of this data contains geospatial information.
Although the datasets are largely geospatial, this information is presented in different formats, e.g., coordinates, parcel numbers, census tracts and block numbers, and physical addresses. This complicates merging the datasets, as the geographic units do not always overlap. In addition, the same variables in different
datasets have different interpretations, depending on the source and purpose of the data, and in the absence of a data dictionary they are hard to distinguish and define. The opposite is also true: some variables containing similar and integrable information have inconsistent names and, as a result, are difficult to identify and overlay.

APPROACH

I. Data Discovery

At this stage, data collection and maintenance of the portal are no one’s official responsibility and are performed purely on a volunteer basis, as was revealed in a meeting with a representative of the Open Data Advisory Group. Because the portal’s initial opening produced no immediate positive impact, enthusiasm for open data in Charlottesville has substantially diminished, and at times the efforts to improve the portal are put completely on hold, which hinders any further attempts at expanding what the portal can accomplish.

Most of the portal’s geospatial data is designed for easy integration with GIS, as UVA’s Scholars Lab pointed out, but that characteristic is not apparent, and using it requires advanced technical skills that are not common among the general public. Nonetheless, viewing the data as integrable with GIS helps define some of the variables and expand the amount of usable information in the datasets.

The design of the datasets, the lack of interpretability of the variables, and the disproportional representation of data across departments are explained by the fact that most of the published data had already been collected and was available in the city departments and, as such, was formatted to the departments’ initial needs. No new data is recorded and formatted specifically for publication on the portal, and some departments that had initially committed to sharing their data started to worry about the risks of sharing it without knowing who would use the disclosed information and to what end.
In addition, publication of the data is seen by the city as the final objective, leaving it to the public to derive their own utility.

II. Data Flow

Figure 1 below shows a flowchart of the three stages of portal development. The first two stages, collection and distribution of the data, are performed by the Open Data Committee and the city staff and thus are outside the scope of this project. Instead, we offer additional steps that process the given data and output user-friendly visualizations that could increase the usability of the portal and contribute to its initial goal of a democratized and informed policy-making process.

PIPELINE

Our main goal was to develop an architecture for a data pipeline that transforms data currently on the portal into accessible information. We aim to create a model that could be implemented by anyone with some technical knowledge, in any programming language, with the ultimate output being a tool that anyone could use. Along the way, we also take note of any hurdles we encounter that might discourage other users from working toward a similar goal.

To begin, we use a simple web scraper built with the Python package Selenium that downloads the datasets from the portal and stores them locally as a series of CSV files. We then clean the data using a variety of methods, primarily with standard Python packages. The most relevant approach we take is to assign a confidence level from 1 to 4 to each feature in every dataframe to represent how confident we are about the meaning of its entries. We then keep only features with confidence levels of 3 or 4 (the levels representing the most confidence) and discard the rest. This approach is a direct result of the lack of documentation on the portal; a more comprehensive data dictionary would eliminate the need for it entirely.
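The confidence-level filter described above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions: the column names, the scores in `CONFIDENCE`, and the helper `keep_confident_features` are hypothetical and are not taken from the portal or from our actual code; only the 1-to-4 scale and the threshold of 3 come from the text.

```python
import pandas as pd

# Hypothetical confidence scores, assigned by hand after inspecting each column.
# 1 = meaning unknown ... 4 = meaning certain.
CONFIDENCE = {
    "X": 4,
    "Y": 4,
    "Offense": 4,
    "BlockNumber": 2,
    "RptArea": 1,
}
MIN_CONFIDENCE = 3  # keep only levels 3 and 4, as described in the text


def keep_confident_features(df, confidence, minimum=MIN_CONFIDENCE):
    """Return a copy of df restricted to columns scored at or above `minimum`.

    Columns with no score are treated as unknown (score 0) and dropped.
    """
    kept = [col for col in df.columns if confidence.get(col, 0) >= minimum]
    return df[kept].copy()


# Illustrative one-row dataset; the values are made up.
raw = pd.DataFrame({
    "X": [-78.4767],
    "Y": [38.0293],
    "Offense": ["Towed Vehicle"],
    "BlockNumber": [600],
    "RptArea": ["A1"],
    "Unscored": [None],
})
cleaned = keep_confident_features(raw, CONFIDENCE)
print(list(cleaned.columns))  # ['X', 'Y', 'Offense']
```

Keeping the scores in a plain dictionary makes the manual judgment explicit and easy to revise if a data dictionary later becomes available.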
At this point, we also rename our features according to their perceived meaning, both consolidating similar features (e.g., changing geospatial data named “X” and “Y” to “Longitude” and “Latitude”) and separating features with deceptively similar names (e.g., “BlockNumber” to “FIPS Block Code” and “Street Number”). We further consolidate our geospatial data into a single format, the census block number, to make the visualization step simpler, acknowledging that this sacrifices precision in favor of aggregation.

With the data cleaned, we use the pandas package in Python to merge similar dataframes on their shared features (e.g., “Longitude” and “Latitude”), keeping only the desired features in the resulting merge. We primarily work with the geospatial data at this step, but the same approach could easily be applied to any type of feature.

At this point, any desired analysis could be performed; given the nature of our project, we do so in the form of visualizations. We choose this medium because our work up to this point is with geospatial data, which is intuitive to analyze visually. We also want to use the visuals as an intuitive way of working toward our ultimate goal of increasing accessibility and use of the portal.

FIGURE 1 THREE STAGES OF PORTAL DEVELOPMENT
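The renaming and merging steps of the pipeline can be sketched as follows. The two small dataframes, their values, and the “Offense” and “AssessedValue” columns are hypothetical stand-ins for portal datasets; only the “X”/“Y” to “Longitude”/“Latitude” renaming and the use of a pandas merge on shared features come from the text.

```python
import pandas as pd

# Hypothetical extracts; column names and values are illustrative only.
crime = pd.DataFrame({
    "X": [-78.4767, -78.5000],
    "Y": [38.0293, 38.0400],
    "Offense": ["Towed Vehicle", "Simple Assault"],
})
parcels = pd.DataFrame({
    "Longitude": [-78.4767, -78.4900],
    "Latitude": [38.0293, 38.0350],
    "AssessedValue": [250000, 410000],
})

# Consolidate differently named coordinate columns under one convention.
crime = crime.rename(columns={"X": "Longitude", "Y": "Latitude"})

# Inner merge on the shared features, then keep only the desired columns.
merged = crime.merge(parcels, on=["Longitude", "Latitude"], how="inner")
merged = merged[["Longitude", "Latitude", "Offense", "AssessedValue"]]
print(merged)
```

An inner merge on raw coordinates matches only rows whose values are exactly equal, which is one reason to aggregate to a coarser shared key such as the census block number before merging.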
RESULTS

I. Hurdles to a Portal User

While working on the data pipeline, we took note of multiple obstacles to utilizing the data. First, data from different city departments is unevenly represented. We attribute this lack of proportionality to three causes: city staff’s persisting reluctance to share their data, obscurity in the existing data that diminishes its utility, and disparate practices of data collection across city departments. These disparate practices also result in inconsistent data formatting, since, as of now, there are no guidelines for generating the data submitted to the portal. Data submission is instead performed on a voluntary basis, and any data is seen as good data as long as it is disclosed to the public. This practice makes it difficult for both humans and machines to read the data, especially given the lack of comprehensive data dictionaries for the published datasets.

There are also multiple technical issues with the portal. At times some datasets are unavailable for download. While some became available after a period of time, a few remain inaccessible. Among the datasets that we were able to scrape, some are too large to be easily read by home computers and standard software. Even when these datasets are readable, it is still difficult to identify significant observations and variables. We believe that the ability to do so is important given the main objective of the portal.

II. Examples and Visualizations

Our data pipeline takes the raw data published on the portal and creates a baseline for informed policy debates. It standardizes geospatial variables so that datasets are easier to integrate and merge, then outputs visualizations that provide insightful summaries for public users of the portal. To demonstrate this capability, we use real estate and crime data to produce relevant visualizations and show how these results can be used in the ongoing policy debate over affordable housing in Charlottesville.
In a visualization of total real estate value in Charlottesville since 1997, one can immediately glean a few insights. For example, as the recession hit the city in 2009, the total value of properties in Charlottesville fell. Only recently did the total value regain its momentum, though it appears to have recovered well relative to the trend before 2009. In addition, some of the highest-valued properties in Charlottesville are labeled along the right, and one can easily see that, besides the hospitals, park, and shopping center, the top-valued properties in Charlottesville are apartment complexes that have sprung up since 2011. Together, these observations could lead a user toward the hypothesis that companies took advantage of lower property values during the recession to buy, develop, and market upper-end housing, further fueling the gentrification process in Charlottesville.

Visualizing crime data alongside the real estate data helps point out trends in the distribution of crime and the geographic pockets in which certain types of crime are prevalent. Although the crime dataset is one of the most comprehensible and well-formatted datasets on the portal, we still see inherent misrepresentation of values. For instance, many data points are concentrated around the police department in downtown Charlottesville. We believe that this concentration has one of two causes: either the police station is recorded as the location of the violation for crimes that are charged at the station, or the station address is used as a default in the absence of another address. As a result, in our geospatial analysis of crime data we exclude the violations that have the police station as their location of record, to give more weight to the observations that occurred at other locations.
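The exclusion of station-address records can be illustrated with a short pandas filter. This is a sketch under assumptions: the `Address` column name, the placeholder `STATION_ADDRESS`, and the helper `drop_station_records` are hypothetical, since the text does not state how the station is encoded in the crime dataset.

```python
import pandas as pd

# Placeholder value: the actual address string used in the crime data
# is not given in the text, so this one is purely hypothetical.
STATION_ADDRESS = "100 EXAMPLE ST"


def drop_station_records(df, address_col="Address", station=STATION_ADDRESS):
    """Remove rows whose recorded location is the police station.

    Addresses are compared after trimming whitespace and upper-casing,
    so minor formatting differences do not hide a match.
    """
    normalized = df[address_col].fillna("").str.strip().str.upper()
    return df[normalized != station.upper()].reset_index(drop=True)


# Illustrative records; the offenses and addresses are made up.
crimes = pd.DataFrame({
    "Offense": ["Simple Assault", "Towed Vehicle", "Vandalism", "Larceny"],
    "Address": ["100 Example St", "200 Main St", " 100 EXAMPLE ST ", "300 Oak Ave"],
})
filtered = drop_station_records(crimes)
print(filtered)
```

Dropping these rows removes genuine incidents at the station along with the defaulted ones, so the count of excluded records is worth reporting next to any map built from the filtered data.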
The data points overlaid on the physical map of Charlottesville illustrate the density of crimes across the city and show pockets of concentration for specific types of violations. Such data, plotted over time, can be useful in analyzing the effects of new policies that aim to reduce crime.

In the area diagram plotted across the time of day, we can see that Towed Vehicles is the most prevalent type of crime recorded in the afternoons, while Simple Assault is more common before 10 a.m. In addition, we see a sudden drop in recorded crimes during lunch hours. We believe that either fewer offenses were penalized during that time or they were not logged until later in the afternoon. A heatmap of the 20 most common crimes logged in Charlottesville over the span of the day shows similar patterns (Appendix 2).

Crime pattern analysis over the months of the year identifies the prevalence of crimes such as Towed Vehicles and Simple Assault in September and October, which could be explained by UVA students returning to campus and the beginning of the football season. By observing monthly patterns, the city can develop strategies to respond to the violations accordingly.

CONCLUSION

We successfully built a data pipeline that takes the data from the portal as input and outputs actionable information in an accessible way. We created a few sample visualizations to showcase what this sort of process can achieve. We have also outlined our process so that anyone who wants to replicate or adjust it can do so in any desired language.

We encountered many problems that could explain the dearth of portal users. Chief among them is the lack of interpretability of the data; though the datasets have descriptions, they lack extensive data dictionaries, so many of the features therein are unusable.
This is likely due to the data collection policy, which has few guidelines to ensure consistency across datasets. Additionally, we found that some departments are represented by a disproportionate number of datasets, perhaps because they were willing and able to offer up more data. If a user isn’t interested in the over-represented departments, this trend could alienate them simply because the available data do not form a reliable picture of Charlottesville. There also still exist simple
problems of accessibility; at times, some datasets on the portal are not available. On top of that, several datasets are simply too large for the average home computer to utilize. Addressing these problems would be an important first step toward achieving a wider user base for Charlottesville’s open data portal.

REFERENCES

[1] Wylie, C., Neeley, K., Ferguson, S. 2019. “Beyond Technological Literacy: Open Data as Active Democratic Engagement?”, accepted for publication in Digital Culture and Society, January 2019.
[2] Erepublicnews.com. 2019. Digital Cities Winners Leverage Technology to Enhance Inclusion, Solve Social Challenges. http://erepublicnews.com/digital-cities-winners-leverage-technology-to-enhance-inclusion-solve-social-challenges.
[3] Forbes.com. 2019. States Offer Information Resources: 50+ Open Data Portals. https://www.forbes.com/sites/metabrown/2018/04/30/us-states-offer-information-resources-50-open-data-portals/#18f5850c5225.

AUTHOR INFORMATION

Lucas Beane, MSDS Student, Data Science Institute, University of Virginia.
Elena Gillis, MSDS Student, Data Science Institute, University of Virginia.
Rafael Alvarado, Assistant Director, Data Science Institute, University of Virginia.
Caitlin Wylie, Assistant Professor, School of Engineering and Applied Science, University of Virginia.

APPENDICES

Appendix 1

Appendix 2