Day 1

Morning

🔑 Key Takeaways for Day 1

What is data?

Data comes in a variety of formats. When we think about data, we typically think about structured data, for example data that can be imported cleanly into Excel or SPSS. There are also a number of alternative data formats, which come in a variety of data types from many different data sources. Today, I will demonstrate these data formats using existing sources, and we will begin to learn how to analyze the data using innovative data science packages. I have chosen to concentrate on available software that is easy to use and that can facilitate the interpretation, analysis, and visualization of data beyond the typical tools (e.g., SPSS). I will note the benefits and drawbacks of each package we review.

Note: Throughout the week, I will provide examples from my own work to illustrate how I have applied these tools. This includes a couple of web scraping applications (one that scrapes data from the Los Angeles Homicide Report and one that automates downloading data from the U.S. Census Bureau's Household Pulse Survey). The goal is to show you what can be done, with the disclaimer that you will probably not be able to do this by the end of the week. Learning about data is a long-term goal, but I am here to help you if at some point you want to implement any of this for your own work.

Show me the data!

There are tons of data available to download on the internet from reliable sources. I am going to review the datasets that I find most interesting and useful for social work. We will select a few datasets to analyze, visualize, and map. This requires knowing how to download and merge the data for analysis. These datasets provide a rich source of contextual information for your research, including your dissertation. As you will see, knowing about them, how to use them, and how to incorporate them into your research is extremely powerful.
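To make the download-and-merge step a little more concrete, here is a minimal Python sketch of what "merging" means: two tables that share an identifier are joined row by row. Everything in it is an assumption for illustration only. The column names (`county_id`, `poverty_rate`, `homicides`) and all of the values are made up and do not come from any real dataset, and the helper functions are hypothetical, not part of jamovi, JASP, or any other package we use this week.

```python
import csv
import io

# Hypothetical contextual data keyed by a county identifier (values are made up).
CONTEXT_CSV = """county_id,poverty_rate
001,15.2
002,17.9
003,16.1
"""

# Hypothetical outcome data keyed by the same identifier (values are made up).
OUTCOME_CSV = """county_id,homicides
001,12
002,15
999,3
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_key(left, right, key):
    """Inner join: keep only rows whose key value appears in both tables."""
    right_by_key = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Combine the columns from both tables into a single row.
            merged.append({**row, **match})
    return merged


merged = merge_on_key(read_rows(CONTEXT_CSV), read_rows(OUTCOME_CSV), "county_id")
for row in merged:
    print(row)
```

Rows that appear in only one table (here, `003` and `999`) are dropped, so it is worth counting rows before and after a merge. The identifiers are kept as text rather than numbers so that leading zeros are preserved.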

I created this website using Obsidian, which you may want to check out. Obsidian is a note-taking app that is based on markdown files. It has many useful plugins, including one for Zotero, a personal research assistant. If you are not currently using Zotero as your bibliography software, I encourage you to download it ASAP. Ask me if you want me to demonstrate how to use Zotero and/or Obsidian now. See the video below for some tutorials and information to help you get started.

🎯 Learning Objectives for Day 1

  • Become familiar with open source software to use for different types of statistical analyses
  • Download and install jamovi and JASP onto your laptop
  • Learn how to install add-ons for SPSS, jamovi and JASP software
  • Run simple descriptive statistics and create APA formatted tables
  • Learn about the types of datasets (repositories and publicly available data) that are of great interest to social work

📃 Summary of Notes

Day 1: Morning Session

Open Source Software

There is a wide variety of open source and proprietary software tools that are simple to learn and that make data management and analysis easier, more effective, and fun. Here is my list of the most important tools to be familiar with. The items in bold are the software that we will use during the course of the week.

The software is listed by availability (i.e., open source, license, package add-on)

Open Source Software

  • R: IMHO the best statistical analysis tool ever created (yes, better than python for statistics)
  • jamovi: the next best option, built on an R framework, nobody will know you are NOT using R :)
    • jamovi is a new “3rd generation” statistical spreadsheet. Designed from the ground up to be easy to use, jamovi is a compelling alternative to costly statistical products such as SPSS and SAS. It is also a simpler alternative to R that is built on top of the R statistical language
    • 🌐 Analyses: jamovi provides a complete suite of analyses for (not just) the social sciences: t-tests, ANOVAs, correlation and regression, non-parametric tests, contingency tables, reliability and factor analysis. The jamovi library contains many more modules that will allow you to perform additional analyses, from simple crosstabs to multilevel modeling and more.
    • 🌐 Excel integration: jamovi is a fully functional spreadsheet, immediately familiar to anyone. Enter, copy/paste data, filter rows, compute new values, perform transforms across many columns at once – jamovi provides a streamlined spreadsheet experience, optimized for statistical data.
    • 🌐 R syntax: Love R? Check out jamovi's “syntax mode”, where the underlying R syntax for each analysis is made available. Just copy and paste this into R for a seamless transition. Alternatively, run R code directly inside jamovi with the Rj Editor.
    • There are two versions of jamovi
      • jamovi Cloud
      • jamovi desktop: downloads and installs jamovi onto your desktop
        • It is available for Windows, Mac and Linux
  • JASP - A Fresh Way to Do Statistics (jasp-stats.org)
    • "Just Another Statistics Program" JASP offers another great alternative to SPSS
    • In some ways JASP is better than jamovi, but it seems less stable and so it is my second best option
    • There are some benefits to using JASP including flexibility in making plots and nice visualizations for the statistical analyses you are conducting
    • Click here to download JASP
    • JASP in particular provides a great way to both learn statistics and R at the same time
    • Let's take a look at the data library in JASP now
  • QGIS: do geospatial analysis like an (ArcGIS) pro. QGIS is for creating, editing, visualizing, analyzing, and publishing geospatial information
    • The name originally stood for "Quantum GIS" (Geographic Information System)
    • The Applications (qgis.org) page gives you a sense of all the cool things you can do with QGIS
    • To download the software, click here
  • PSPP (GNU Project, Free Software Foundation) is an open source equivalent to SPSS and is freely downloadable. It is not very robust, so I would not recommend it.

Proprietary software

  • OSU has obtained a license for the following software packages:
    • SPSS: it's like a necessary evil; everyone must know how to use SPSS. There are some things that are actually easier in SPSS.
    • SAS: ugh!
    • Excel: can be great for cleaning your data, particularly if you use the built-in functions.
    • ArcGIS Map, Pro
    • JMP: this is a very cool data science program from the makers of SAS (the archaic and soon-to-be-extinct software program)
    • To download these software packages, log in to the Office of Business and Finance (osu.edu)

❓ Your turn

  • Download jamovi onto your laptop
  • Download JASP onto your laptop
  • Download QGIS onto your laptop

Add-on Packages

Day 1: Afternoon Session

Intro to data & data types:

Before starting an analysis, it is important to know the kind of data that you have. This requires understanding not only the distinction between qualitative and quantitative data, but also the difference between structured and unstructured data, time series data, spatial data, etc. We will start by briefly reviewing the different types of data that are available and the multiple ways in which data can be measured. This is of utmost importance because the analysis you choose is driven by the type of data you have. In other words, you cannot make your data fit the analysis you know; rather, you must learn how to analyze the data you have.
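As a small illustration of how the type of data drives the analysis, the Python sketch below picks a different descriptive summary depending on how a variable is measured. It is a simplified sketch under stated assumptions, not a complete rule: the level labels (`"nominal"`, `"ordinal"`, `"interval"`), the `describe` function, and the example values are all made up for this illustration, and `"interval"` here also stands in for ratio data.

```python
import statistics
from collections import Counter


def describe(values, level):
    """Return a summary suited to the measurement level of `values`.

    `level` is one of "nominal", "ordinal", or "interval" (hypothetical
    labels used only in this sketch; "interval" also covers ratio data).
    """
    if level == "nominal":
        # Categories have no order: report counts and the most common category.
        counts = Counter(values)
        return {"counts": dict(counts), "mode": counts.most_common(1)[0][0]}
    if level == "ordinal":
        # Ordered categories coded as integers: the median respects the order.
        return {"median": statistics.median(values)}
    if level == "interval":
        # Numeric data with meaningful distances: mean and standard deviation.
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    raise ValueError(f"unknown measurement level: {level}")


# Made-up example data
housing = ["rent", "own", "rent", "other", "rent"]
agreement = [1, 2, 2, 4, 5]  # 1 = strongly disagree ... 5 = strongly agree
age = [22, 35, 41, 29, 33]

print(describe(housing, "nominal"))
print(describe(agreement, "ordinal"))
print(describe(age, "interval"))
```

Asking for a mean of the housing categories would not make sense, which is the point: the measurement level narrows down which summaries and tests are appropriate before you ever open the software.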


❓ Your turn

  • Identify a dataset that is not listed above
  • Find the codebook associated with the dataset
  • Download the data
    • What limitations are there for analyzing the data?
    • What types of research questions have these data been used to address?
    • How could you use the data in your own research?

📦 Additional Resources