ONLINE - Using R for data analysis October 2021 - II
- Start date: 12 October 2021
- Duration: 4 days
- Intended Audience: PhD
Introduction
What we teach: R is an open-source, free environment/language for statistical computing and graphics. It provides a large repository of statistical analysis methods.
The goal of the course is to teach students how the R language, extended by the tidyverse package, can be used to build a report with a simple statistical analysis of data provided in a table. The course assumes no prior programming knowledge. This is not a statistics course! Elementary knowledge of statistics is necessary to understand the examples.
After the course you will be able to:
- understand and write (tidyverse-based) R code;
- know where to look for R methods to perform statistical analyses of your own data;
- generate reproducible reports from your own data in HTML, PDF or DOC formats.
The following topics will be covered:
- R expressions.
- R data objects: vectors, data frames (tibbles), lists.
- R Markdown for building reproducible reports.
- Data manipulation: filtering, sorting, summarising of a table; joining/merging multiple tables (with tidyverse/dplyr and tidyverse/tidyr).
- Visualisation: scatter plots, histograms, boxplots (with tidyverse/ggplot2).
- R packages: installation and usage.
Course material / Course structure
All study materials are supplied electronically only. The material will be covered in lectures and practical sessions.
The course is divided into 8 half-day sessions. The teachers are available for online chat during all teaching slots. At the beginning of the week before the course, the teachers can be contacted for help with R and RStudio installation problems (see below).
THE SESSIONS #1-#7: PRACTICING
The course is given online in a plenary setting. Each session is split into a few small topics, each introduced as follows:
- a short online (live video) lecture for introduction/demonstration;
- a self-study practice session (with primary exercises and extra exercises) with chat interactions;
- (when needed) a short chat question and live video answers session.
The students are asked to type the commands being presented and observe their effects (avoid copy-paste; typing the commands yourself is important for learning how to respond to mistakes/errors).
THE SESSION #8: SELF-STUDY ASSIGNMENT
A self-study assignment (SSA) will be offered during the last (#8) session. A set of tasks will be provided, to be solved within 2 hours. At the end there will be a general discussion with chat questions and live video answers.
The overall SSA goal will be to prepare an R Markdown document reporting an analysis of a dataset. You will carry out the following steps:
- create an RStudio project, knit an R Markdown document;
- read a table which will be provided;
- show some filtered/summarized content of the table;
- produce some plots of the data.
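As a rough illustration of what such an analysis might look like, the following is a minimal sketch and not the actual assignment. It assumes the tidyverse package is installed and uses the built-in mtcars dataset as a stand-in for the table that will be provided; the column choices and the names cars, summary_tbl and p are purely illustrative.

```r
library(tidyverse)

# Stand-in for the provided table (assumption): mtcars ships with R.
# A table supplied as a file would typically be read with read_csv() instead.
cars <- as_tibble(mtcars, rownames = "model")

# Show some filtered/summarised content of the table:
# keep cars with more than 100 hp, then summarise per number of cylinders.
summary_tbl <- cars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg))

print(summary_tbl)

# Produce a plot of the data: weight versus fuel consumption.
p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point()

print(p)
```

Placed inside a code chunk of an R Markdown document, code like this would be executed when the document is knitted, and the table and plot would appear in the report.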
Prerequisites
Participants must be able to use a laptop/computer capable of running a recent version of RStudio. See the Installation section below.
Moreover, the Zoom client needs to be installed (please do not join in a browser window; we have observed problems with that).
Installation
Installation of R and RStudio software, including additional packages, is required before the start of the course. Please follow the instructions below.
NOTE: Resolving installation problems during the course may be impossible; therefore, please follow the steps below a week before the start of the course. In case of failure, please inform the teachers. In some situations the intervention of the administrator of your computer might be necessary.
- Install R: go to the R Project for Statistical Computing (https://www.r-project.org/) and follow the download and installation instructions.
- Install RStudio: go to the RStudio download page (https://www.rstudio.com/products/rstudio/download/#download), select a version of RStudio appropriate for your laptop, download it and then install. Please check whether you can start RStudio.
Some additional packages are needed for the course. During the course the participants will learn how to install packages, but this process occasionally fails (for example because additional steps are needed in a particular operating system, permissions to access some system directories are lacking, or other software is too old).
- Install the tidyverse package: Start RStudio. Go to menu Tools/Install Packages... In the Packages field enter tidyverse. Press Install. A lot of messages will now be shown in the Console window; wait until the installation finishes. In the Console window type library( tidyverse ) and press Enter. Some messages might be displayed, but if there is no error the installation is complete.
- Install packages needed for R Markdown: Start RStudio. Go to menu File/New File/R Markdown... A New R Markdown window is displayed. Press OK. If any R Markdown packages are missing, you will now be asked to install them. Finally, you will see an editor window with the header Untitled1. Put the cursor in that window, then click Knit. Some messages might be displayed, but if a window with some text and a plot is shown afterwards the installation is complete.
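For those who prefer typing commands, the same check can be sketched in the RStudio Console. This is only an illustration and not part of the official instructions: it assumes that the R Markdown support comes from the rmarkdown package on CRAN and that an internet connection is available; the variable names needed and missing_pkgs are made up for this example.

```r
# Packages used in the course (tidyverse) and for knitting reports (rmarkdown).
needed <- c("tidyverse", "rmarkdown")

# Names of the needed packages that are not yet installed in this R library.
missing_pkgs <- setdiff(needed, rownames(installed.packages()))

# Install only what is missing; this step needs an internet connection.
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

# Loading the package is the actual check: an error here means that the
# installation did not succeed.
library(tidyverse)
```

If library(tidyverse) prints only messages and no error, the outcome is the same as for the menu-based steps above.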
During the course, applications in multiple windows will be used. For a better experience we advise a setup with two monitors (e.g. a laptop and an external monitor).
Certificate of Attendance
In order to obtain a proof of participation, participation in all lectures and practical sessions is required. If you have participated in the full course, you will receive a certificate of attendance within two weeks after the course.
Language
Course material and lectures are in English.
Target group
Master and PhD students in the bio-medical sciences.
Organizing committee / Teachers
- Dr. S.M. Kielbasa (S.M.Kielbasa@lumc.nl)
- Drs. R. Monajemi (R.Monajemi@lumc.nl)
TUESDAY 12 OCTOBER 2021 | |
08:45 | Online registration |
09:00 | Lectures (slot #1 - R and RStudio basics) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
12:30 | Break |
13:30 | Lectures (slot #2 - Data structures 1/2) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
17:00 | End of day 1 |
WEDNESDAY 13 OCTOBER 2021 | |
08:45 | Online registration |
09:00 | Lectures (slot #3 - Data manipulation 1/3) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
12:30 | Break |
13:30 | Lectures (slot #4 - Data manipulation 2/3) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
17:00 | End of day 2 |
THURSDAY 14 OCTOBER 2021 | |
08:45 | Online registration |
09:00 | Lectures (slot #5 - Data structures 2/2) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
12:30 | Break |
13:30 | Lectures (slot #6 - Graphics) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
17:00 | End of day 3 |
FRIDAY 15 OCTOBER 2021 | |
08:45 | Online registration |
09:00 | Lectures (slot #7 - Data manipulation 3/3 and functions) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
12:30 | Break |
13:30 | Self-study assignment (slot #8) Dr. Szymon Kielbasa, Drs. Ramin Monajemi |
17:00 | End of day 4 |
ONLINE
Please note that this course will only take place when the minimum number of participants is reached. Therefore you will pre-register for this course. You will receive a message and the invoice as soon as it is confirmed that the course will proceed. With this registration method, we prevent unnecessary payment transactions in the event of cancellation of the course.
Regular course fee | € 450,- |
Reduced fee for PhD students LUMC | € 150,- |
Reduced fee for employees LUMC | € 150,- |
BA/MA students of Leiden University | Free of charge * |
BA/MA students of other universities (non Leiden University) | € 75,- * |
* Limited places available. In order to validate your student registration, you must register with your student e-mail address and submit your student number on the registration form. In addition, a scan of your student pass will have to be submitted to boerhaavenascholing@lumc.nl. Please note that a € 45,- cancellation fee will be charged to students who do not attend the course (no show), or cancel their registration.