Start a conversation

How can I export and merge data in R?

Before reading this article, please make sure you have exported your data in CSV format. See also: Export data.

Export your data into R

Follow these steps to read your data into R. 

  1. Read a .csv file in R using the following code:
    studydata <- read.csv(file="C:\\yourfilepath\\exportstudydata.csv",header=TRUE, sep=",")

    Note: It is possible that the values in your .csv file are separated with semicolons rather than commas. You can check this by opening the .csv file with any text application (e.g. Excel or Notepad). If this is the case, replace the comma separator (sep=",") with a semicolon:

    studydata <- read.csv(file="C:\\yourfilepath\\exportstudydata.csv", header=TRUE, sep=";")
  2. In the code above, replace:
    C:\\yourfilepath\\exportstudydata.csv

    with your own file path.

  3. Replace single backslashes (\) in your file path with two backslashes (\\) as shown in the code above.
  4. Run the code to read your file into R. You have now created "studydata".
  5. View your data by running:
View(studydata)


Merge study and report data in R in long format

After you have exported your data, you can merge your data. Follow these steps if you want to merge your study and report data in a long format. This means that for every report instance, a new row is added for a record. 

Let's say you have imported your study data and report data into R, which are called "studydata" and "report". 

  1. Make the column name of the record ID identical, so that you can merge the data based on the  record ID:       
    colnames(studydata)[1] <- "recordID"
    colnames(report)[1] <- "recordID"
  2. Merge the files by running the following line of code:
    myfulldata = merge(studydata,report,by="recordID",all=TRUE)
  3. View your merged dataset by running:
    View(myfulldata)


Merge study and report data in R in wide format

After you have exported your data, you can merge your data. Follow these steps if you want to merge your study and report data in a wide format. This means that for every report instance, new columns are added.

Let's say you have imported your study data and report data into R, which are called "studydata" and "report". 

  1. Make the record ID column names identical so you can merge on recordID:

    colnames(report)[1]<-"recordID"
    colnames(studydata)[1]<-"recordID"
  2. We'll use the data.table package to cast the report data to a wide format (i.e., every report instance has its own columns). If you have never installed packages before, use: 

    install.packages("data.table")

    See also this tutorial on how to install packages in R. If you have done this (before), load the package: 

    library(data.table)
  3. Define which variables you want to include:

    1. All variables (i.e., from report creation date to the last variable in your report):

      allVariables <- colnames(report)[2:ncol(report)]
    2. Specific variables, based on the variable name:                
      allVariables <- c("Report.Creation.Date","Report.Name.Custom","Report.Parent", "yourVariable1", "yourVariable2","yourVariable3")
    3. A number of variables, based on their variable/column position. The record ID has position 1, record creation date has position 2, the custom report name has position 3, etc. So for example, if you want to include report creation date, custom report name, parent report, and your own first 2 variables, you can use:                
      allVariables <- colnames(report)[2:6]
  4. After defining the variables you want to include, cast your report data to a wide format with the code below. Every column name will have a suffix “_ReportX”, with X being the number of the report instance.    
    castedReport <- dcast(setDT(report), recordID~rowid(recordID, prefix="Report"), value.var=allVariables)
  5. View your castedReport with:         
    View(castedReport)
  6. Merge the casted report with the study data with: 

    myfulldata <- merge(studydata,castedReport,by="recordID",all=TRUE)
  7. View your merged dataset with: 

    View(myfulldata)


Choose files or drag and drop files
Was this article helpful?
Yes
No
  1. Castor Support Team

  2. Posted
  3. Updated