Course Schedule
The schedule is tentative; we may arrange visits to a few organizations. Details TBD.
Introduction
- Week 1 8/24: Why this course?
- Week 2 8/31: Data management and life cycle
- Week 3 9/7: Documenting data and version control
Understanding Data
- Week 4 9/14: Data structure, relational database, and data dictionary
- Week 5 9/21: Data types and data visualization and interaction
Data Acquisition and Preprocessing
- Week 6 9/28: Acquiring data: open data and open-source intelligence (guest speaker)
- Week 7 10/5: Text and relation as data
- Week 8 10/12: Data cleaning, preprocessing, and organizing + Data security
- Week 9 10/19: Field visit: Dress for Success
- Week 10 10/26: Standardization and automation
Data Management and Social Science Research
- Week 11 11/2: Concepts and measures in social sciences (guest speaker)
- Week 12 11/9: Data reuse and data governance
- Week 13 11/16: Final project workday - no class, the week before Thanksgiving
- Week 14 11/30: From empirical study to theory building. Final project presentation.
Week 0 Pre-course
- Learning modules
- Introduction to Python
- Recommended: Introduction to Shell
- Pre-course survey
Week 1: Why this course?
Before class
- Readings:
- Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature News, 533(7604), 452. doi:10.1038/533452a.
- Briney, K. (2015). The Data Problem. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England). HOLLIS number: 014921191. Exeter, UK: Pelagic Publishing.
- Gentzkow, M., & Shapiro, J. M. (2014). Introduction. In Code and data for the social sciences: A practitioner’s guide.
- Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A.-L., Brewer, D., . . . Alstyne, M. V. (2009). Computational Social Science. Science, 323(5915), 721–723. doi:10.1126/science.1167742.
In class
- Discussion and lecture on readings.
- Course review: Syllabus, assignments, final project.
After class
- Learning modules: Intermediate Python
- Dictionaries & Pandas
Week 2: Data management and life cycle
Before class
- Readings:
- Briney, K. (2015). Planning for Data Management. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England). HOLLIS number: 014921191. Exeter, UK: Pelagic Publishing.
- Briney, K. (2015). The Data Lifecycle. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England). HOLLIS number: 014921191. Exeter, UK: Pelagic Publishing.
- Ruane, J. M. (2016). Designing Ideas: What Do We Want to Know and How Can We Get There? In Introducing Social Research Methods: Essentials for Getting the Edge (pp. 67–92). Chichester, West Sussex, UK; Hoboken, NJ: John Wiley & Sons Inc.
In class
- Profile for group matching (“Student Profile”)
- Discussion and lecture on readings.
- Review final project possibilities.
After class
- Learning modules: Intermediate Python
- Logic, Control Flow and Filtering
- Assignment 1: Plagiarism test (10%)
Week 3: Documenting data and version control
Before class
- Readings:
- Briney, K. (2015). Documentation. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England). HOLLIS number: 014921191. Exeter, UK: Pelagic Publishing.
- Broman, K. W., & Woo, K. H. (2017). Data organization in spreadsheets (tech. rep. No. e3183v1). PeerJ Inc. doi:10.7287/peerj.preprints.3183v1.
- Gentzkow, M., & Shapiro, J. M. (2014). Version Control. In Code and data for the social sciences: A practitioner’s guide.
In class
- Discussion and lecture on readings.
- Student presentation.
- Group discussion on client projects.
After class
- Learning modules:
Week 4: Data structure, relational database, and data dictionary
Before class
- Readings:
- Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10). http://www.jstatsoft.org/v59/i10/
- Normalization of Database
- Gentzkow, M., & Shapiro, J. M. (2014). Keys. In Code and data for the social sciences: A practitioner’s guide.
In class
- Discussion and lecture on readings.
- Group practice: Form to Table (Your annual happy-hour/nightmare: Form 1040)
- Student presentation.
- Group discussion on client projects.
After class
- Learning modules: Introduction to Importing Data in Python
- Further readings:
- Swaroop C. H. (2013). Data Structures. In A Byte of Python.
Week 5: Data types and data visualization and interaction
Before class
- Readings:
- Kirk, A. (2019). Working With Data. In Data Visualisation: A Handbook for Data Driven Design (2nd edition, pp. 95–117). SAGE Publications Ltd.
- Kirk, A. (2019). The Visualisation Design Process. In Data Visualisation: A Handbook for Data Driven Design (2nd edition, pp. 31–58). SAGE Publications Ltd.
In class
- Discussion and lecture on readings.
- Student presentation: Hajiyeva.
- Group discussion on client projects.
After class
- Assignment 3 [1/2]: Customized learning - Planned chapters (3%)
- Assignment 5 [1/3]: Client project - Project contract draft (5%)
- Data Visualization for Everyone
Week 6: Acquiring data: open data and open-source intelligence
Before class
- Readings:
- Williams, H. J., & Blum, I. (2018). Defining Second Generation Open Source Intelligence (OSINT) for the Defense Enterprise. RAND Corporation.
- Review Bing News Search API: what can you do with it?
In class
- Discussion and lecture on readings.
- Open dataset / portal examples
- Group discussion on client projects.
- Hands-on: Bing News Search API (if time allows).
After class
- Assignment 5 [2/3]: Client project - Project contract final (5%)
- Learning modules:
Week 7: Text and relation as data
Before class
- Readings:
- Grimmer, J., & Stewart, B. M. (2013). Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts. Political Analysis, 21(3), 267–297. doi:10.1093/pan/mps028.
- Provan, K. G., Veazie, M. A., Staten, L. K., & Teufel-Shone, N. I. (2005). The use of network analysis to strengthen community partnerships. Public Administration Review, 65(5), 603–613.
In class
- Discussion and lecture on readings.
- Group discussion on client projects.
After class
- Learning modules:
- Further readings:
- Borgatti, S. P., & Foster, P. C. (2003). The Network Paradigm in Organizational Research: A Review and Typology. Journal of Management, 29(6), 991–1013. doi:10.1016/S0149-2063(03)00087-4.
Week 8: Data cleaning, preprocessing, and organizing + Data security
Before class
Data cleaning, preprocessing, and organizing
- Briney, K. (2015). Organization. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England).
- Gentzkow, M., & Shapiro, J. M. (2014). Directories. In Code and data for the social sciences: A practitioner’s guide.
- Miksa, T., Simms, S., Mietchen, D., & Jones, S. (2019). Ten principles for machine-actionable data management plans. PLOS Computational Biology, 15(3), e1006750. doi:10.1371/journal.pcbi.1006750.
Data security
- Briney, K. (2015). Managing sensitive data. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England). HOLLIS number: 014921191. Exeter, UK: Pelagic Publishing.
- Case: UT Data Classification Standard
In class
- Discussion and lecture on readings.
- Student presentation: Barroso.
- Group discussion on client projects.
After class
Data cleaning, preprocessing, and organizing
- Learning modules:
Data security
- Learning modules:
Week 9: Field visit: Dress for Success (3000 S I-35 Frontage Rd Suite 180, Austin, TX 78704)
Schedule
- 2-3pm:
- 3-4:30pm:
Week 10: Standardization and automation
Before class
- Required readings:
- Gentzkow, M., & Shapiro, J. M. (2014). Automation. In Code and data for the social sciences: A practitioner’s guide.
- Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., & Teal, T. K. (2017). Good enough practices in scientific computing. PLOS Computational Biology, 13(6), e1005510. doi:10.1371/journal.pcbi.1005510.
- de Visser, C., Johansson, L. F., Kulkarni, P., Mei, H., Neerincx, P., van der Velde, K. J., et al. (2023). Ten quick tips for building FAIR workflows. PLOS Computational Biology, 19(9), e1011369. doi:10.1371/journal.pcbi.1011369.
- Recommended readings:
- Wilkinson et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. doi:10.1038/sdata.2016.18.
- Wilson, G., Aruliah, D. A., Brown, C. T., Hong, N. P. C., Davis, M., Guy, R. T., . . . Wilson, P. (2014). Best Practices for Scientific Computing. PLOS Biology, 12(1), e1001745. doi:10.1371/journal.pbio.1001745.
- Gentzkow, M., & Shapiro, J. M. (2014). Appendix: Code Style. In Code and data for the social sciences: A practitioner’s guide.
In class
- Discussion and lecture on readings.
- Good enough practices in final project
- Student presentation: Ramarao, Wang.
- Group discussion on client projects.
After class
- Work on your customized learning modules.
Week 11: Concepts and measures in social sciences
Before class
- Required readings:
- Ruane, J. M. (2016). All That Glitters Is Not Gold: Assessing the Validity and Reliability of Measures. In Introducing Social Research Methods: Essentials for Getting the Edge (pp. 117–138). Chichester, West Sussex, UK; Hoboken, NJ: John Wiley & Sons Inc.
- Ruane, J. M. (2016). Measure by Measure: Developing Measures: Making the Abstract Concrete. In Introducing Social Research Methods: Essentials for Getting the Edge (pp. 93–116). Chichester, West Sussex, UK; Hoboken, NJ: John Wiley & Sons Inc.
- Shoemaker, P. J., Tankard, J. W., & Lasorsa, D. L. (2003). Theoretical Concepts: The Building Blocks of Theory. In How to Build Social Science Theories (pp. 15–36). SAGE Publications.
- Recommended readings:
- Gerring, J. (1999). What Makes a Concept Good? A Criterial Framework for Understanding Concept Formation in the Social Sciences. Polity, 31(3), 357–393. doi:10.2307/3235246.
In class
Guest speaker: Lifelong Learning with Friends (2pm)
For the client (30 mins with Q&A):
- What data are generated in daily operations, and how.
- How data are processed and shared, and how people collaborate on them, in daily operations.
- How the data relate to the business, and who uses the data.
For the student team (20 mins with Q&A):
- Where is the niche for the team to fit in?
- What are the deliverables, and how can they be useful?
Weekly class activities
- Discussion and lecture on readings.
- Student presentation: Raza, Chavero.
- Group discussion on client projects.
Week 12: Data reuse and data governance
Before class
- Readings:
- Briney, K. (2015). Data reuse and restarting the data lifecycle. In Data management for researchers: Organize, maintain and share your data for research success. Research Skills Series (Exeter, England).
- Ghavami, P. (2020). Data Governance and Data Security. In Big Data Management: Data Governance Principles for Big Data Analytics. De Gruyter.
In class
- Discussion and lecture on readings.
- Student presentation: Vanegas, Zhang.
- Group discussion on client projects.
After class
- Make sure you and your team are on track with all outstanding assignments.
Week 13: Final project workday - no class (week before Thanksgiving)
Assignment due:
Also work on any outstanding assignments:
- Analysis of empirical studies and presentation.
- Client project.
Week 14: From empirical study to theory building. Final project presentation
Before class
- Readings:
- Creswell, J. W. (2014). The Use of Theory. In Research design: Qualitative, quantitative, and mixed methods approaches (4th ed). Thousand Oaks: SAGE Publications.
- Sutton, R. I., & Staw, B. M. (1995). What Theory is Not. Administrative Science Quarterly, 40(3), 371–384. doi:10.2307/2393788.
- Shoemaker, P. J., Tankard, J. W., & Lasorsa, D. L. (2003). Theoretical and Operational Linkages. In How to Build Social Science Theories. SAGE Publications.
- Lazer, D. M. J., Pentland, A., Watts, D. J., Aral, S., Athey, S., Contractor, N., Freelon, D., Gonzalez-Bailon, S., King, G., Margetts, H., Nelson, A., Salganik, M. J., Strohmaier, M., Vespignani, A., & Wagner, C. (2020). Computational social science: Obstacles and opportunities. Science, 369(6507), 1060–1062. doi:10.1126/science.aaz8170. (Review the Lazer et al. article from Week 1.)
In class
- Discussion and lecture on readings.
- Final project presentations.
After class
- Further readings:
- Shoemaker, P. J., Tankard, J. W., & Lasorsa, D. L. (2003). Creativity and Theory Building. In How to Build Social Science Theories (pp. 145–166). SAGE Publications.
- Assignment 5 [3/3]: Client project - Final report and presentation (30%)
- Also make sure you clear outstanding submissions for these assignments: