Importing Data with RStudio

This feature is currently only available in RStudio Preview, 0.99.1130 or higher.

Introduction

Importing data into R is a necessary step that can, at times, become time-consuming. To ease this task, RStudio includes new features to import data from csv, xls, xlsx, sav, dta, por, sas and stata files.

Importing data

The data import features can be accessed from the "Environment" pane or from the "Tools" menu. The importers are grouped into three categories: delimited data, Excel data and statistical data. To access this feature, use the "Import Dataset" dropdown in the "Environment" pane, or open the "Tools" menu and choose the "Import Dataset" submenu.

Importing data from CSV files

The CSV importer lets you:

  • Import from the file system or a URL
  • Change column data types
  • Skip columns or include only selected columns
  • Rename the data set
  • Skip the first N rows
  • Use the header row for column names
  • Trim spaces in names
  • Change the column delimiter
  • Select the encoding
  • Select quote, escape, comment and NA identifiers

For example, you can easily import a csv file from data.gov by pasting this URL https://data.montgomerycountymd.gov/api/views/6rqk-pdub/rows.csv?accessType=DOWNLOAD and selecting "Import".
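As a rough sketch of what several of these options look like in code, the example below uses the readr package (which the import dialog depends on, judging by the readr update prompt mentioned in the comments). The file contents, the column names and the `scores` data set name are made up for illustration, and the exact code the dialog generates may differ.

```r
library(readr)

# Hypothetical sample file standing in for a downloaded CSV.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "exported for illustration",   # leading line we want to skip
  "id,name,score,notes",
  "1,Ann,3.5,ok",
  "2,Bob,NA,late",
  "3,Cy,4.25,ok"
), csv_path)

scores <- read_csv(
  csv_path,
  skip = 1,                      # "Skip the first N rows"
  col_names = TRUE,              # use the header row for column names
  col_types = cols(
    id    = col_integer(),       # change column data types
    name  = col_character(),
    score = col_double(),
    notes = col_skip()           # skip a column entirely
  ),
  na = c("", "NA"),              # NA identifiers
  trim_ws = TRUE                 # trim surrounding whitespace
)

print(scores)
```

`read_csv()` also accepts a URL in place of a local path, and `read_delim()` takes a `delim` argument for files that use a different column delimiter.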

Importing data from Excel files

The Excel importer lets you:

  • Import from the file system or a URL
  • Change column data types
  • Skip columns
  • Rename the data set
  • Select a specific Excel sheet
  • Skip the first N rows
  • Select NA identifiers

For example, you can easily import an xls file from data.gov by pasting this URL http://www.fns.usda.gov/sites/default/files/pd/slsummar.xls and selecting "Import".

Notice that this file contains two tables and therefore requires the first few rows to be removed.

We can clean this up by skipping 6 rows from this file and unchecking the "First Row as Names" checkbox.

 

The file looks better, but some columns are displayed as strings when they clearly contain numeric data. We can fix this by selecting "numeric" from the column dropdown.

The final step is to click "Import", which runs the code under "Code Preview" and imports the data into RStudio.
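The same choices (skipping six rows, not using the first row as names, and forcing numeric columns) can be sketched with the readxl package. This is an assumption about what the "Code Preview" roughly corresponds to, not the exact generated code. To keep the sketch runnable offline, it reads the `datasets.xlsx` example file bundled with readxl instead of the USDA spreadsheet, so the skipped rows here are ordinary data rows rather than a title block.

```r
library(readxl)

# Example workbook shipped with readxl; it stands in for slsummar.xls.
xlsx_path <- readxl_example("datasets.xlsx")

iris_part <- read_excel(
  xlsx_path,
  sheet = "iris",        # select a specific sheet
  skip = 6,              # skip the first 6 rows of the sheet
  col_names = FALSE,     # same as unchecking "First Row as Names"
  col_types = c("numeric", "numeric", "numeric", "numeric", "text")
)

str(iris_part)
```

Because `read_excel()` works on local files, a spreadsheet hosted at a URL has to be downloaded first (for example with `download.file()`) before it can be read this way.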

Importing data from SPSS, SAS and Stata files

The SPSS, SAS and Stata importer lets you:

  • Import from the file system or a URL
  • Rename the data set
  • Specify a model file
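For statistical formats, a minimal sketch using the haven package is shown below. That the importer relies on haven is an assumption here, and the `survey` data frame and file paths are invented so the example can run without external files: it writes small SPSS and Stata files to temporary paths and reads them back.

```r
library(haven)

# Hypothetical data used only to create sample files.
survey <- data.frame(id = c(1, 2, 3), age = c(34, 51, 27))

sav_path <- tempfile(fileext = ".sav")
dta_path <- tempfile(fileext = ".dta")
write_sav(survey, sav_path)   # SPSS file
write_dta(survey, dta_path)   # Stata file

from_spss  <- read_sav(sav_path)
from_stata <- read_dta(dta_path)

# SAS files are read with read_sas(), which also takes an optional
# catalog_file argument. Whether that matches the dialog's "model file"
# option is an assumption, not something the article confirms.

print(from_spss)
print(from_stata)
```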

 

Comments

  • attoumand

    I'm using RStudio Version 0.99.903, but I don't see the new features to import data from csv, xls, xlsx, sav, dta, por, sas and stata files. I see only two options (Local File and Web URL).
    My OS is Windows 10 32-bit.
    Can someone please help me?

  • Javier Luraschi

    0.99.903 does not yet contain this functionality. Try installing a newer preview from here instead: https://www.rstudio.com/products/rstudio/download/preview/

  • mspinola10

    I am working with RStudio Preview 1.0.12 (Windows 10, 64-bit).

    I am trying to read a txt file, but when I try to change one of my columns from character to factor, it asks me "Please enter the format string".
    What is that, and why is it asking me for it?

  • Javier Luraschi

    Thanks for the feedback. We are planning to improve this by asking for a comma-separated list of factor levels.

    In the meantime, you can specify the factor levels as follows: c("factor1", "factor2", "factor3")

  • matjung

    I believe this feature is available in rstudio-1.0.136 (CentOS 7, 64-bit), but I get this message:
    Preparing data import requires an updated version of the readr package.

    Updating the readr package fails.
    Based on the error messages, readr depends on curl.
    For whatever reason, RStudio does not find libcurl:
    No package 'libcurl' found
    Package libcurl was not found in the pkg-config search path.
    Perhaps you should add the directory containing `libcurl.pc'
    to the PKG_CONFIG_PATH environment variable
    No package 'libcurl' found
    How can I fix that?

  • Camilla L. Nesbo

    I recently upgraded my RStudio and am now having issues with setNames.
    I used to use
    FileT = setNames(data.frame(t(File[,-1])), File[,1])
    to make the column names in File the row names in the transposed FileT.

    Now it just puts all the names into the first cell of the data frame.
    Does anyone know what I can do to fix this?

  • Javier Luraschi

    @Matjung: See https://github.com/jeroen/curl. You probably want to install curl with `sudo yum install libcurl-devel` on CentOS 7.

  • Javier Luraschi

    @Camilla: I'm not aware of any changes in setNames. I would suggest opening a new question in our support forum so that some of my colleagues can help you out.
