Importing Data with RStudio

Introduction

Importing data into R is a necessary step that can, at times, become time-consuming. To ease this task, RStudio includes features to import data from csv, xls, xlsx, sav, dta, por, sas and stata files.

data-import-rstudio-overview.gif

Importing data

The data import features can be accessed from the "Environment" pane or from the "File" menu. The importers are grouped into three categories: text data, Excel data and statistical data. To access this feature, use the "Import Dataset" dropdown from the "Environment" pane:

Screen_Shot_2018-10-31_at_9.24.22_PM.png

Or through the "File" menu, followed by the "Import Dataset" submenu:

Screen_Shot_2018-10-31_at_9.28.55_PM.png

Importing data from Text and CSV files

The "From Text (readr)" option imports CSV files and, more generally, character-delimited files using the readr package. This text importer provides support to:

  • Import from the file system or a url
  • Change column data types
  • Skip or include-only columns
  • Rename the data set
  • Skip the first N rows
  • Use the header row for column names
  • Trim spaces in names
  • Change the column delimiter
  • Select the encoding
  • Select quote, escape, comment and NA identifiers

For example, you can easily import a CSV file from data.gov by pasting this URL https://data.montgomerycountymd.gov/api/views/2qd6-mr43/rows.csv?accessType=DOWNLOAD and selecting "Import".

Screen_Shot_2018-10-31_at_9.39.02_PM.png
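
The options listed above correspond to arguments of readr's reading functions. The following is a minimal sketch, not the exact code RStudio generates. It assumes readr is installed and writes a small, made-up file to a temporary location so that it runs without network access; the column names and values are hypothetical.

```r
library(readr)

# Hypothetical sample data written to a temporary file. In practice `path`
# could be a file on disk or a URL such as the one above.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Sample export (hypothetical data)",
  "id,name,score,notes",
  "1, alpha ,3.5,x",
  "2,beta,NA,y",
  "3,gamma,n/a,z"
), path)

# The name on the left of `<-` is the name of the imported data set.
scores <- read_delim(
  path,
  delim     = ",",                         # column delimiter
  skip      = 1,                           # skip the first N rows (the title line)
  col_names = TRUE,                        # use the header row for column names
  trim_ws   = TRUE,                        # trim surrounding spaces
  na        = c("", "NA", "n/a"),          # NA identifiers
  locale    = locale(encoding = "UTF-8"),  # encoding
  col_types = cols(
    id    = col_integer(),                 # change a column's data type
    name  = col_character(),
    score = col_double(),
    notes = col_skip()                     # skip a column
  )
)

print(scores)
```

For a plain comma-separated file, read_csv() takes the same arguments without delim.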

Importing data from Text files

The "From Text (base)" option imports text files using base R, which helps preserve compatibility with previous versions of RStudio.

Screen_Shot_2018-10-31_at_9.33.14_PM.png
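
As a rough illustration of what a base R import looks like, the sketch below reads a small, made-up file with read.csv(). The data and the variable names are hypothetical, and the code RStudio generates for a given file may differ.

```r
# Hypothetical sample data written to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,alpha,3.5", "2,beta,NA"), path)

scores <- read.csv(
  path,
  header           = TRUE,   # first row holds the column names
  sep              = ",",    # field separator
  dec              = ".",    # decimal mark
  quote            = "\"",   # quote character
  na.strings       = "NA",   # strings treated as missing values
  stringsAsFactors = FALSE   # keep text columns as character vectors
)

str(scores)
```

No additional packages are needed for this route, which is why it works on older installations.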

Importing data from Excel files

The Excel importer provides support to:

  • Import from the file system or a url
  • Change column data types
  • Skip columns
  • Rename the data set
  • Select a specific Excel sheet
  • Skip the first N rows
  • Select NA identifiers

For example, you can easily import an xls file from data.gov by pasting this URL http://www.fns.usda.gov/sites/default/files/pd/slsummar.xls and selecting "Update".

Notice that this file contains two tables and therefore requires the first few rows to be skipped.

Screen_Shot_2018-10-31_at_9.41.09_PM.png

We can clean this up by skipping 6 rows from this file and unchecking the "First Row as Names" checkbox.

Screen_Shot_2018-10-31_at_9.41.28_PM.png 

The file is looking better but some columns are being displayed as strings when they are clearly numerical data. We can fix this by selecting "numeric" from the column dropdown.

Screen_Shot_2018-10-31_at_9.42.13_PM.png

The final step is to click "Import" to run the code under "Code Preview" and import the data into RStudio. The final result should look as follows:

Screen_Shot_2018-10-31_at_9.44.21_PM.png
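
The same kind of cleanup can be written with readxl::read_excel(). The sketch below assumes readxl is installed and uses the example workbook bundled with that package instead of the USDA file, so that it runs offline. The sheet name and the variable name are therefore illustrative and do not reproduce the screenshots above.

```r
library(readxl)

# read_excel() reads local files, so a workbook at a URL has to be downloaded
# first, for example with download.file(url, destfile, mode = "wb").
# Here the example workbook shipped with readxl stands in for a downloaded file.
path <- readxl_example("datasets.xlsx")

excel_sheets(path)          # list the sheets available in the workbook

cars_raw <- read_excel(
  path,
  sheet     = "mtcars",     # select a specific sheet
  skip      = 6,            # skip the first 6 rows
  col_names = FALSE,        # equivalent to unchecking "First Row as Names"
  col_types = "numeric",    # a single type is recycled across all columns
  na        = ""            # NA identifier
)

print(cars_raw)
```

Because col_names is FALSE, readxl assigns placeholder column names, which can be replaced afterwards with names().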

Importing data from SPSS, SAS and Stata files

The SPSS, SAS and Stata importer provides support to:

  • Import from the file system or a url
  • Rename the data set
  • Specify a model file

We can import https://github.com/rstudio/webinars/raw/master/23-Importing-Data-into-R/data/Child_Data.sav by pasting the address under "File/Url", clicking "Update", and then clicking "Import".

Screen_Shot_2018-10-31_at_9.48.39_PM.png
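
For reference, the haven package can read these formats in code. The sketch below assumes haven is installed and reads the example SPSS file bundled with that package so that it runs offline. The variable names are illustrative.

```r
library(haven)

# Example SPSS file shipped with haven. A URL such as the one above could be
# passed instead, since haven downloads http(s) paths automatically.
path <- system.file("examples", "iris.sav", package = "haven")
iris_sav <- read_sav(path)

# Labelled SPSS variables arrive as labelled vectors; as_factor() converts
# the labelled columns of a data frame into ordinary R factors.
iris_factors <- as_factor(iris_sav)

print(iris_factors)

# Related readers in haven: read_dta() for Stata files, read_por() for SPSS
# portable files, and read_sas(), which accepts an optional catalog_file.
```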

Need Help?

RStudio Pro customers may open a discussion with RStudio Support at any time.

You may also ask for help from R and RStudio users on community.rstudio.com. Be sure to include a reproducible example of your issue.

Comments

  • Philipp Wickey

    Hello. Having issue with header. My data is in csv form. I am looking at animals. Does R and Rstudio allow only 1 row as a header? My data has 3 rows as a header. For example in row 1 I have a family name, in row 2 I have a genus-species name, and row three I have a function name. Will that be an issue?

  • Scott Overmyer

    @Javier, I resolved this issue by starting over and installing R, then R Studio 1.1.383 on the Ubuntu box. When everything was updated in turn, the issue seems to have gone away. I must have not been doing something in the correct order.

  • Javier Luraschi

    Hi @Philipp, multiple header rows are currently unsupported. I've added a feature request in readr to support this in the future: https://github.com/tidyverse/readr/issues/742

    @Scott, I'm glad! Thanks for sharing your finding in this post.

  • Lakis

    Hi there. Yesterday I started learning R using Rstudio. I imported a quarterly dataset from excel in a dataframe, having in the first column (DATE) dates as 1947:Q1, 1947:Q2, etc.. and in the second column (Prices) quarterly prices using numbers. When I try to plot them I use: plot(US_rGDP_vintage$DATE, US_rGDP_vintage$Prices) I get an error: Error in plot.window(...) : need finite 'ylim' values.
    What is the problem here? Does R recognize the dates as I have imported them?

  • Premal Sheth

    Hi I am trying to do text mining . when i importing files in Rstudio using VCorpus command all apostrophe(') comes like ’.
    from some googling i found its encoding problem but how can i resolve it
    I am already using United state English.

    Please let me know
    Thanks in Advance

  • Sidra

    hi there, i am working with RStudio version 3.2.5. I have dataset in dta format. but i cant open it in R ? why its happening.

  • Duy Nguyen

    Hey there, I tried to import some data in SAS format, but keep getting an error message.
    1) Downloaded "DS1: Collaborative Psychiatric Epidemiology Surveys (CPES), 2001-2003" in SAS format from this website where: http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/20240.

    2) Clicked IMPORT DATASET => format SAS=> tried importing .sas Data files from that survey.

    3) Keep getting "Error: Is this valid SAS, SPSS, STATA?"

    Thanks!
    Duey

  • Javier Luraschi

    @Duy, I opened this issue for you under the 'haven' package to investigate further: https://github.com/tidyverse/haven/issues/330

    @Sidra Can you try importing this file with the 'haven' package and if the import fails, please open a github issue under https://github.com/tidyverse/haven/

  • Duy Nguyen

    So this works. See below:

    Hi,

    I would suggest downloading the SPSS version of the data and use the
    read.spss function to import into R.
    Here is the syntax

    library(foreign)
    imported_data = read.spss("SPSS_FILENAME.sav", to.data.frame = TRUE)

    There will be some undeclared levels warning due to some values having
    labels and some not having labels in the same variables.

    Yours

    Kilsang
    ICPSR at University of Michigan, Ann Arbor

  • Jia Duan

    Hi, I'm using Version 1.1.423 for Mac, but there is not option for CSV file. I have Text (base), Text (readr), Excel, SPSS, SAS, and Stata. Please help. Thanks!

  • Javier Luraschi

    @Jia Duan, use "From Text (readr)" or "From Text (base)..." to import data from CSVs.