Starting Where You Are – Tips for Your First Data Science Project (Online; Love Data Week)
Jamaica Jones, PhD Candidate, University of Pittsburgh, School of Computing and Information
& Ali Krzton, Research Data Management Librarian, Auburn University
For those of us who aren’t data scientists, “data science” can feel mysterious and out of reach. In reality, data science is like everything else – completed in a series of steps, inevitably beginning with the first. Nobody begins anything as an expert – this talk will provide some tools for starting where you are.
Led by two recent data science beginners, this talk will open with an introduction to getting to know your data. Understanding the structure and scope of your data is an essential first step in working with it successfully. We will use an open-source tool called OpenRefine to review a data set and understand the many ways in which data can be messy.
Another important step in building data science skills is working from existing code. Far from being a lazy shortcut, code reuse is standard practice in data science. We will share some tips for getting the most out of code samples, including finding reusable code, interpreting functions and adapting existing code to the needs of your project.
Both presenters recently completed data science fellowships, which they began as newcomers to the field. By the end, each had completed projects that advanced analysis across federal funding and research data management agencies. They did so by starting where they were, taking one step at a time, and learning to ask for help when needed. The talk will conclude with a review of easy-to-access resources available to data workers of all levels of ability.
Audience: General Pitt community
This workshop is part of Love Data Week! Check out the entire lineup at https://lovedatapgh.io/.