Monday, 10 September 2018
Whet your analytic appetite
- Data analytics: What data? How might I analyze it?
- Is this a field that fits my interests and temperament?
- Discuss example questions, skills, and competencies associated with each of the three core domains of Data Analytics: Stats, domain knowledge, programming
- Students for Urban Data Systems (SUDS) club at Carnegie Mellon University's Heinz College
- Columbia Data Science Program
- Western PA Regional Data Center
- Wikipedia technology portal
This station set offers exposure-focused activities and challenges which each relate to some facet of the emerging field of Data Science.
Station 1: People in data tables
- Get to know one another by creating a data schema for data about the individuals in our class. Be sure to delimit the data types that each column can contain. Your choices for data types depend on the language or package you're using. For the purposes of this exercise, use whatever language you're most familiar with.
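One way to write such a schema down is sketched below in Python, assuming a dataclass is an acceptable stand-in for a table definition. The column names (first_name, home_zip_code, and so on) and the validate helper are hypothetical illustrations, not a required answer; your group's columns will differ.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class ClassmateRecord:
    # Hypothetical columns: each attribute is one column, each annotation its data type.
    first_name: str
    home_zip_code: str        # text, not a number, so leading zeros survive
    birth_date: date
    years_programming: float
    owns_bicycle: bool


def validate(record):
    """Return the names of columns whose value does not match the declared type."""
    problems = []
    for column in fields(record):
        value = getattr(record, column.name)
        if not isinstance(value, column.type):
            problems.append(column.name)
    return problems


# Example row with made-up values, checked against the schema.
row = ClassmateRecord("Ada", "02134", date(2000, 1, 1), 3.5, True)
print(validate(row))
```

Note the choice to store the zip code as text: a numeric column would silently turn "02134" into 2134, which is the kind of decision a schema forces you to make up front.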
Station 2: Trends and predictions
Many businesses and organizations must make decisions based on predictions of future conditions. Anticipating how the costs of a system's inputs will change over time influences planning efforts economy-wide.
- Download this spreadsheet of historic data of college prices. Review the source and vet it for reliability.
- If a foundation is planning scholarship investments, how many times more expensive will private, four-year institutions be compared to public, two-year institutions 15 years from now?
- State a level of confidence for your estimate. Qualify your estimate based on outside factors whose changes may impact your recommendations.
- Print out or provide any backup info to justify your assertion.
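One possible approach to the 15-year question is sketched below in Python, assuming prices keep growing at the same compound annual rate observed between the first and last rows of each series. That assumption is a simplification chosen for illustration, and the function names are hypothetical. The numbers in the example are placeholders, not values from the spreadsheet; substitute the real rows before drawing any conclusion.

```python
def annual_growth_rate(series):
    """Compound annual growth rate between the first and last (year, price) rows."""
    (first_year, first_price), (last_year, last_price) = series[0], series[-1]
    return (last_price / first_price) ** (1 / (last_year - first_year)) - 1


def project_price(series, years_ahead):
    """Extend the last observed price forward at the historical growth rate."""
    rate = annual_growth_rate(series)
    last_price = series[-1][1]
    return last_price * (1 + rate) ** years_ahead


def projected_ratio(private_series, public_series, years_ahead):
    """How many times more expensive the first series is projected to be."""
    return project_price(private_series, years_ahead) / project_price(public_series, years_ahead)


# PLACEHOLDER numbers only, not real tuition data.
private_four_year = [(2000, 100.0), (2010, 200.0)]
public_two_year = [(2000, 50.0), (2010, 50.0)]
print(round(projected_ratio(private_four_year, public_two_year, 15), 2))
```

A constant growth rate is the weakest part of this estimate, which is why the confidence level and outside factors asked for above matter as much as the number itself.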
Station 3: Domain knowledge
A large share of data analysis competencies are rooted in command of field- or subject-specific knowledge. For example, predictions of changes in bus ridership due to economic shifts are best made by those who know transportation best.
- Visit this site exploring data in crime prediction algorithms. Summarize the findings from this study. Take a moment to discuss with your group what data science has to contribute to this discussion.
- Compare this data-related study to this one, by Mark Egge, who analyzed bus bunching patterns in Pittsburgh. What did Mark's study find?
- How are the applications of data science different in these two studies? What standards of data reliability or integrity exist? How are they different?
Station 4: Gapminder Web!
In this station, you'll have the chance to explore a data visualization tool that has shaped the development of many online data visualization tools.
- Visit the Gapminder web app. Devote about 5 minutes to just exploring the interface, tinkering with trends through time, etc.
- The default chart shows income versus life expectancy by country. View the timeline video since 1800 with your team. Make a list of the factors that contribute to the major shifts and jumps seen by various country groups in this visualization.
- Prioritize with your group the three factors which make this tool one of the most successful visualization endeavors on the internet.
- Visualize babies per woman versus income on the X axis. Discuss with your group the trends since 1800. What outliers exist? Why do they exist? How could the data be misinterpreted?
- Explore the visualization modes other than "bubbles", such as "maps" and "trends". Which mode is the least intuitive for the lay user? Most intuitive?
Station 5: Final project review
- All of the pilot course's final projects are uploaded to our shared directory. Find this shared directory by navigating back to technologyrediscovery.net >> DAT-102 >> SP18 Final Project Shared Directory
- Some students chose more ambitious final projects than others. Dedicate a few minutes to reviewing at least 3 of the projects. Create a Plus-Delta table for the projects with your group: What did you think the students did well, overall? What areas of improvement could be made in their efforts?
- Discuss with your group any burning ideas you have for final projects in DAT-102!
Note card diagnostic
Front: Which of today's stations contained content you felt a) most comfortable with and b) most excited to learn more about?
Back: Which of today's stations are you least familiar with? Are there any you are nervous about learning?