During the summer of 2015, we were selected to participate in the University of Washington’s eScience Institute Data Science for Social Good summer incubator program, where we worked with an interdisciplinary group of graduate student fellows and faculty research scientists to develop an initial prototype of our proposed community well-being report pages.
Here are the wrap-up slides we presented on the final day of the event.
Our code is up on GitHub at https://github.com/ShellyDianeFarnham/CommunityWellbeing
The prototype is in “closed alpha” for now, but email me (Shelly) at shelly at thirdplacetechnologies dot com if you want to check it out.
It was a very intense summer, and I can’t thank our student fellows enough for all their hard work. It was a lot of data to pull together in a meaningful way in a short amount of time, and it was immensely valuable to have such an interdisciplinary team focusing on the problem: Ryan Burns’ perspective as a geography expert, Jenny Ho’s experience with economic analysis, Jordan Bates’s background in computer science and applied math, Yue Zhou’s patience in wrangling and analyzing King County crime data, and our high school students’ (Avery Glass and Jennifer Nino) adventurous spirits in getting feedback from neighborhood residents and processing Facebook data. I also want to thank our DSSG staff mentor Bernease Herman for all her help in mentoring the students as they worked through their individual problems, and the DSSG crew (especially Sarah and Micaela) for their hard work in providing such a fabulous program connecting data science students with organizations like ours.
Building on our recent work examining the use of social media and open data analytics to support hyperlocal community awareness and civic engagement around local issues, we are creating a new experimental third place system, Spokin, to incorporate community self-assessment metrics with identity management tools and situated communication channels that encourage citizen response. The primary goal of Spokin is to enable community organizers and everyday citizens to leverage new affordances in social media, open data, and situated communication channels for ongoing situated community self-awareness around issues affecting their well-being, and for immediate, intelligent, collective issue response. A key focus of this work, which we expect to have both intellectual merit and broader impact, is the development of a dynamic Community Well-being Index and a Community Hubs Index, based on data analytics integrating social media and open data. Given the importance of being able to engage with community messages in situ, another longer-term objective of this line of work is to explore new opportunities for citizens to interact with neighborhood content through mobile and embedded devices.
Summary of our summer project:
For this project we integrated several social media and open data sources to develop predictors of community well-being in King County neighborhoods and cities, including neighborhood Twitter activity and content analysis, activity in Facebook groups and pages, Yelp activity, crime statistics, and census data. We then used machine learning and hierarchical regression analysis techniques to develop a measurement model, using existing survey data from the Happiness Initiative (Musikanski, 2013) as our ground truth dependent variable, which includes a self-report community well-being measure aggregated to the level of King County zip codes. Based on these findings we then developed summary measures of social vitality, thriving third places, population investment, socio-economic status, diversity, and stress (based on weighted, linear combinations of statistically significant features), out of which we further created an overall Community Well-being Index.
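To make "weighted, linear combinations" a bit more concrete, here is a minimal Python sketch of how a summary measure could be built from standardized features. It is an illustration only, not the code from our repository: the feature names, values, weights, and the helper functions `zscores` and `composite_index` are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers to mean 0 and (population) SD 1."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def composite_index(features, weights):
    """Weighted linear combination of standardized features, one score per area.

    features: dict mapping feature name -> list of values (one per zip code)
    weights:  dict mapping feature name -> weight (hypothetical values here)
    """
    standardized = {name: zscores(col) for name, col in features.items()}
    n = len(next(iter(features.values())))
    return [
        sum(weights[name] * standardized[name][i] for name in weights)
        for i in range(n)
    ]


# Hypothetical values for three zip codes (not real project data).
features = {
    "social_vitality": [10.0, 20.0, 30.0],
    "stress": [3.0, 2.0, 1.0],
}
# Hypothetical weights; stress is weighted negatively.
weights = {"social_vitality": 0.6, "stress": -0.4}

index = composite_index(features, weights)
```

Standardizing first puts features measured on very different scales (for example, tweet counts and crime rates) on a comparable footing before they are weighted.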
Our preliminary results were promising, with our overall Community Well-being Index (based entirely on social media and open data) correlating with the Happiness Initiative self-reported community well-being measure at r = .65, p < .001, N = 167. However, we encountered complex interaction effects that warrant further analysis with a larger sample size. For example, we found that while racial diversity overall correlated negatively with community well-being, for minorities in inclusive neighborhoods, community well-being was especially high. While provocative, these findings need more statistical power before we can have confidence in them. Further, the Happiness Initiative survey data was collected online, which raises sampling concerns, and it aggregates only to the zip code level. An important next step is to develop a valid ground truth community well-being data set at the neighborhood level to further develop our models.
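For readers less familiar with the r statistic reported above, the following Python sketch shows how a Pearson correlation between an index and a survey measure can be computed. The function `pearson_r` and the two short lists of scores are hypothetical and only illustrate the calculation; they are not our data or our analysis code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five zip codes (not real project data).
index_scores = [0.2, -1.1, 0.8, 1.5, -0.4]
survey_scores = [3.1, 2.4, 3.6, 3.9, 3.0]

r = pearson_r(index_scores, survey_scores)
```

A value of r near 1 means the index and the survey measure rise and fall together across areas; a value near 0 means they are essentially unrelated.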
You may also find a summary of the project on the DSSG blog.