Lab1-System Setup Tutorial
Welcome to SBU SOC591 Intro to CSS Lab training session. You can get access to this tutorial via https://yongjunzhang.com/files/css/Lab1-Tutorial.html. You get the RMarkdown file here.
This tutorial aims to help CSS students set up their computer system to meet necessary lab training requirements. It includes basic steps to install Git, Github, R, Rstudio, Python, Jupyter notebook, Spyder, Google Cloud service, vscode, etc. You can choose the appropriate ones they need.
If you don’t want to install anything on your computer, you can use rstudio.cloud (R) and google colab (python). But I strongly suggest that you install them because you want to have more experience.
Install Homebrew on Mac
Homebrew https://brew.sh/ is the missing package manager for macOS (or Linux). So it installs software you need that Apple (or your Linux system) didn’t have.
To install, paste the following code in the quote in a macOS Terminal or Linux shell prompt. If you don’t know terminal, you can do spotlight search and type terminal to open it…
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
You can check here https://brew.sh/ for more details. If you use windows system, just skip this.
Then Use Homebrew to Install Git on Mac
Git is one of the most popular version control system and you can use it with your github.com repository as well as RStudio.
Copy and paste the following code into your terminal window and hit Return.
brew install git
Note that you can also use homebrew to install other software you need using similar commands. For more details on brew commands, you can check here https://docs.brew.sh/Manpage.
If you use Windows System, you need to download Git first and then install Git on Windows
You can download GIT here https://gitforwindows.org/
After you have successfully installed GIT, you need to MAKE some basic configurations.
Here is the GIT cheat sheet https://github.github.com/training-kit/downloads/github-git-cheat-sheet/ Usually you need to set up your global name and email address (usually your github account).
Check here to sign up for a free github account https://github.com/join. Check here to see why you need GITHUB https://github.com/features.
Another way to use GIT is via GITHUB desktop.
You can check here https://docs.github.com/en/desktop/getting-started-with-github-desktop/setting-up-github-desktop to download and install GITHUB DESKTOP. But you have to set up your account and Install GIT first.
Install R and Rstudio
Of course, you can use homebrew to install R on MAC. But let us do it from scratch.
Before Installing Rstudio, you need to Install R. You can download R and other necessary software by clicking here https://cran.r-project.org/. You can choose the appropriate version for your system (e.g., windows, Mac). Be careful to follow its installing instruction, especially regarding those necessary software like xquartz.
After this step, please go to RStudio website to download and install RStudio desktop. You can click here https://rstudio.com/products/rstudio/download/ and choose the free version.
Then, you can use install.packages function to install necessary R packages. But I suggest you to copy and paste following code to install some common packages for data processing and visualization. You can add any packages you want to install by defining the “packages” variable.
if (!requireNamespace("pacman"))
install.packages('pacman')
library(pacman)
<-c("tidyverse",
packages"tidytext",
"rvest",
"RSelenium",
"stm",
"tm",
"tidytext",
"janitor",
"quanteda",
"topicmodels",
"caret",
"plotly",
"haven",
"ggrepel",
"readxl",
"lubridate",
"stringdist",
"tensorflow",
"keras",
"torch",
"torchvision")
p_load(packages,character.only = TRUE)
You can also click here to install tensorflow and keras https://tensorflow.rstudio.com/installation/. We will use it for image classification. Of course, you can install python version for keras and tensorflow…
Install Python
There are couple of ways to install python. For simplicity, we use anaconda to install python and its associated software.
You can click here to download and install anaconda https://www.anaconda.com/products/individual. Please choose the right version for your windows or mac system.
Please follow the instruction to install anaconda and spyder.
Once your python is installed, you can use conda or pip to install necessary modules. For instance, you can install nltk module using the following code:
pip install nltk
Note that if you use python 3+, you need to check whether it is pip3 or pip when install python modules.
Spyder is a shell program like Rstudio that helps efficiently use python. But you can also use jupyter notebook or google colab. I suggest that you should install jupyter notebook/or jupyter lab first (you don’t need to install spyder though).
You can use pip to install jupyter notebook. For more details, you can check here https://jupyter.org/install
open your terminal, and type the following code:
pip install jupyterlab
If you system does not install pip, you can check here for more details:https://pip.pypa.io/en/stable/installing/
Similarly, you can install modules like pandas, numpy, sci-learn, selenium,beautifulsoup, tensorflow, pytorch, keras, etc.
We will install other necessary packages as the semester progresses.
NOW,let’s move to Google Cloud
In order to use Google Cloud service, you have to get an account first. Use your personal Google Gmail account, instead of university email account, to login into cloud console. You can check here for more details https://cloud.google.com/gcp/. Once you activate your Google cloud service, you should be able to set up a project and payment method. Note that Google provides 300 credits for new users. That would be enough for our course.
Why we don’t use university email account? Sometimes it does not have access to some services. For instance, you cannot use google colab (similar to ipython notebook) when you use stony brook g-suit account.
You can also download and install Google SDK by checking here https://cloud.google.com/sdk. Here is a quick start for MAC: https://cloud.google.com/sdk/docs/quickstart-macos. Here is another quick start for Windows: https://cloud.google.com/sdk/docs/quickstart-windows
We are going to use Google bigquery, map, and NLP api.
Some Notes
Please be aware that even though you are asked to install python and relevant software, we will focus more on R language. I tend to use more R because most sociologists use R for their research. Of course, we will cover some basic python. If you want to be a good computational social scientist, I do think learning these two languages is necessary though you have to suffer the learning curve if you have zero experience.