Data comes in a variety of formats. When we think about data, we typically think about structured data, for example data that can be imported cleanly into Excel or SPSS. There are also a number of alternative data formats, covering a variety of data types from many different data sources. Today, I will demonstrate these data formats using existing sources, and we will begin to learn how to analyze the data using innovative data science packages. I have chosen to concentrate on available software that is easy to use and can facilitate the interpretation, analysis, and visualization of data beyond the typical tools (e.g., SPSS). I will note the benefits and drawbacks of each package we review.
Note: Throughout the week, I will provide examples from my own work to illustrate how I have applied these tools. This includes a couple of web scraping applications (one that scrapes data from the Los Angeles Homicide Report and one that automates downloading data from the U.S. Census Bureau's Household Pulse Survey). The goal is to show you what can be done, with the disclaimer that you will probably not be able to do this by the end of the week. Learning about data is a long-term goal, but I am here to help you if at some point you want to implement any of this for your own work.
Show me the data!
There are tons of data available to download on the internet from reliable sources. I am going to review the datasets that I find most interesting and useful for social work. We will select a few datasets to analyze, visualize, and map. This requires knowing how to download and merge the data for analysis. These datasets provide a rich source of contextual information for your research, including your dissertation. As you will see, knowing about them, how to use them, and how to incorporate them into your research is extremely powerful.
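To make "merge the data" concrete, here is a minimal sketch in Python using only the standard library. The two small tables, their column names, and every value in them are made up for illustration; real downloads will have their own layouts. It keeps every row from the first file and attaches the matching contextual columns from the second.

```python
import csv
import io

# Hypothetical, made-up values for illustration only (not real statistics).
SURVEY_CSV = """county_id,respondents,mean_anxiety_score
001,120,2.4
002,85,3.1
003,60,2.9
"""

CONTEXT_CSV = """county_id,county_name,median_income
001,Alpha County,52000
002,Beta County,61000
"""

def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def left_merge(left_rows, right_rows, key):
    """Keep every row of left_rows; attach matching columns from right_rows."""
    lookup = {row[key]: row for row in right_rows}
    right_columns = [c for c in right_rows[0] if c != key]
    merged = []
    for row in left_rows:
        match = lookup.get(row[key])
        combined = dict(row)
        for column in right_columns:
            # None marks a value that had no match in the second file.
            combined[column] = match[column] if match else None
        merged.append(combined)
    return merged

merged = left_merge(read_rows(SURVEY_CSV), read_rows(CONTEXT_CSV), "county_id")
for row in merged:
    print(row)
```

Two things to notice: the identifier is kept as text, so leading zeros survive, and the county with no match is kept with empty contextual columns rather than silently dropped.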
I created this website using Obsidian, which you may want to check out. Obsidian is a note-taking app based on Markdown files. It has many useful plugins, such as one that integrates with Zotero. If you are not currently using Zotero as your bibliography software, I encourage you to download it ASAP. Ask me if you want me to demonstrate how to use Zotero and/or Obsidian now. See the video below for some tutorials and information to help you get started.
🎯 Learning Objectives for Day 1
Become familiar with open source software to use for different types of statistical analyses
Download and install jamovi and JASP onto your laptop
Learn how to install add-ons for SPSS, jamovi and JASP software
Run simple descriptive statistics and create APA formatted tables
Learn about the types of datasets (repositories and publicly available data) that are of great interest to social work
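As a rough preview of what "simple descriptive statistics" means, here is a small sketch in Python's standard library. The ages are made up, and this is not how jamovi or JASP work internally; it only shows the kind of numbers a descriptives table reports.

```python
import statistics

# Made-up ages for ten hypothetical clients (illustration only).
ages = [23, 35, 41, 29, 35, 52, 47, 35, 31, 38]

summary = {
    "n": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "sd": statistics.stdev(ages),  # sample standard deviation (divides by n - 1)
    "min": min(ages),
    "max": max(ages),
}

# Print each statistic rounded to two decimals.
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

In the point-and-click tools we will install, the same summary comes from selecting the variable and ticking the statistics you want, with no code required.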
Summary of Notes
Day 1: Morning Session
Open Source Software
There are a wide variety of different open source and proprietary software tools that are simple to learn and make data management and analysis easier, more effective and fun. Here is my list of the most important tools to be familiar with. The items in bold are the software that we will use during the course of the week.
The software is listed by availability (i.e., open source, license, package add-on).
Open Source Software
R: IMHO the best statistical analysis tool ever created (yes, better than Python for statistics)
I will not be covering R, but both R and RStudio (the GUI for R) can be downloaded from the RStudio Desktop page at Posit
jamovi: the next best option, built on an R framework, nobody will know you are NOT using R :)
jamovi is a new "3rd generation" statistical spreadsheet. Designed from the ground up to be easy to use, jamovi is a compelling alternative to costly statistical products such as SPSS and SAS. It is also a simpler alternative to R that is built on top of the R statistical language.
Analyses: jamovi provides a complete suite of analyses for (not just) the social sciences: t-tests, ANOVAs, correlation and regression, non-parametric tests, contingency tables, reliability and factor analysis. The jamovi library contains many more modules that will allow you to perform additional analyses, from simple crosstabs to multilevel modeling and more.
Excel integration: jamovi is a fully functional spreadsheet, immediately familiar to anyone. Enter, copy/paste data, filter rows, compute new values, perform transforms across many columns at once. jamovi provides a streamlined spreadsheet experience, optimized for statistical data.
R syntax: Love R? Check out jamovi's "syntax mode", where the underlying R syntax for each analysis is made available. Just copy and paste this into R for a seamless transition. Alternatively, run R code directly inside jamovi with the Rj Editor.
Add-on Packages & Modules offer you additional functionality and are available for all software programs listed above
Day 1: Afternoon Session
Intro to data & data types:
Before starting an analysis, it is important to know the kind of data that you have. This requires understanding not only the distinction between qualitative and quantitative data, but also the difference between structured and unstructured data, time series data, spatial data, etc. We will start by briefly reviewing the different types of data that are available and the multiple ways in which data can be measured. This is of utmost importance because the analysis you choose is driven by the type of data you have. In other words, you cannot make your data fit the analysis you know; rather, you must learn how to analyze the data you have.
Qualitative v Quantitative
Structured v unstructured data
Data types
Data innovation - being creative with data: here we address the question: what really constitutes data? Over the years, I have learned that even advanced students do not really know how to properly identify data that can be used in an analysis. How can we identify, and more importantly utilize, different data sources in our analysis?
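The idea that the type of data drives the analysis can be sketched in a few lines of Python. This assumes the familiar nominal / ordinal / interval / ratio levels of measurement, which these notes do not spell out. The function `summarize` and all the values are hypothetical, written only for this illustration.

```python
import statistics
from collections import Counter

def summarize(values, level):
    """Return a summary that suits the variable's level of measurement."""
    if level == "nominal":
        # Categories have no order, so report counts and the most common one.
        counts = Counter(values)
        return {"counts": dict(counts), "mode": counts.most_common(1)[0][0]}
    if level == "ordinal":
        # Ordered categories coded as ranks: the median respects the ordering.
        return {"median": statistics.median(values)}
    if level in ("interval", "ratio"):
        # Numeric scales with equal spacing: mean and standard deviation apply.
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    raise ValueError(f"unknown level of measurement: {level!r}")

# Made-up values for illustration only.
housing = ["rent", "own", "rent", "unhoused", "rent"]   # nominal
satisfaction = [1, 2, 2, 4, 5]                           # ordinal, 1 = low ... 5 = high
income = [18000, 24000, 31000, 45000, 52000]             # ratio

print(summarize(housing, "nominal"))
print(summarize(satisfaction, "ordinal"))
print(summarize(income, "ratio"))
```

The point is the branching, not the code: a mean of housing categories is meaningless, so the sensible summary changes with the kind of variable you have.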
In addition, once jamovi is installed, there is a 'data library' available to you. This data library has worked examples for all modules that you install.