The RStudio IDE includes a data viewer that allows you to look inside data frames and other rectangular data structures. The viewer also includes some simple exploratory data analysis (EDA) features that can help you understand the data as you manipulate it with R.
Starting the viewer
You can invoke the viewer in a console by calling the View function on the data frame you want to look at. For instance, to view the built-in iris dataset, run these commands:
> data(iris)
> View(iris)
You can also start the viewer by clicking on the table data icon on the right, in the environment pane.
As you might expect, you can sort by any column just by clicking on the column. Click on a column that’s already sorted to reverse the sort direction.
To remove sorting and show the data in the order R sees it, click the empty cell in the upper left.
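Sorting in the viewer changes only what is displayed, so it can help to know the console equivalent when you want to keep a sorted copy. The following is a minimal sketch using base R's order(); the names by_length and by_length_desc are illustrative and are not produced by the viewer.

```r
# Sort iris by Sepal.Length, ascending (comparable to one click on the column).
by_length <- iris[order(iris$Sepal.Length), ]

# Reverse the direction (comparable to a second click on the same column).
by_length_desc <- iris[order(iris$Sepal.Length, decreasing = TRUE), ]

# iris itself keeps its original row order; only the copies are sorted.
head(by_length$Sepal.Length)
```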
To apply filters, click the Filter icon in the toolbar. Any field that can be filtered will have a white box labeled All. Click this box to change which field values you want to see. For instance, you might filter out irises with a sepal width greater than 3.6.
Note the text on the bottom, which indicates how many records the dataset contained before and after filtering; in this case we’ve filtered 150 records down to 135.
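A comparable filter can be written at the console, which is useful when you want to keep the result as an object. This sketch assumes the filter keeps rows whose Sepal.Width is 3.6 or less; narrow_sepals is an illustrative name.

```r
# Keep only irises whose sepal width is at most 3.6.
narrow_sepals <- iris[iris$Sepal.Width <= 3.6, ]

# Compare record counts before and after filtering.
nrow(iris)
nrow(narrow_sepals)
```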
Not all kinds of fields can be filtered. At the moment, only the following types are supported:
- Factor (treated as character if > 256 levels)
- Boolean (logical)
Filters are additive (i.e. joined with AND); that is, if you apply two column filters, you will see only records that match both of them.
Clear individual filters by clicking the (x) next to the filter; to clear all the filters at once, click the Filter icon in the toolbar.
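In R terms, applying two column filters behaves like combining two conditions with &. A small sketch follows; the particular conditions and the names is_setosa, is_narrow and both are chosen only for illustration.

```r
# Each filter corresponds to one logical condition on a column.
is_setosa <- iris$Species == "setosa"
is_narrow <- iris$Sepal.Width <= 3.6

# Rows must satisfy both conditions to be kept (logical AND).
both <- iris[is_setosa & is_narrow, ]

nrow(both)
```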
You can search for text across all the columns of your frame by typing in the global filter box.
The search feature matches the literal text you type against the displayed values, so in addition to searching for text in character fields, you can search for e.g. 4.6 and see results in logical and numeric field types.
Searching and filtering are additive; when both are applied, you will see only records that match your filters and contain your search text.
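The search behaviour can be roughly approximated in R by converting every column to text and looking for a literal match. This is only a sketch, not the viewer's actual implementation: search_rows is a hypothetical helper, as.character() stands in for the displayed value, and details such as case sensitivity or number formatting may differ in the viewer.

```r
# Hypothetical helper: keep rows where any column, as text, contains `text`.
search_rows <- function(df, text) {
  # Convert each column to its character representation.
  shown <- lapply(df, as.character)
  # Literal (non-regex) match per column; NA values never match.
  hits <- lapply(shown, function(col) grepl(text, col, fixed = TRUE))
  # A row is kept if it matches in at least one column.
  df[Reduce(`|`, hits), , drop = FALSE]
}

# Matches numeric columns containing 4.6 as well as any text containing "4.6".
search_rows(iris, "4.6")

# Combining a search with a filter keeps rows that satisfy both.
search_rows(iris[iris$Sepal.Width <= 3.6, ], "4.6")
```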
In most cases the viewer will automatically refresh itself if it detects that the underlying data has changed. For instance, try this:
> data(Orange)
> View(Orange)
> Orange[1, "age"] <- 120
You’ll see the age of the first tree change from 118 to 120 in the viewer.
This auto-refreshing feature has some prerequisites, so if it doesn’t seem to be working, check the following:
- You must call View() on a variable directly. If, for instance, you call View(rbind(foo, bar)), you’re invoking View() on a new object created by evaluating your expression, and while that object contains data, it’s just a copy and won’t update when foo or bar changes.
- The variable must be in an environment on the search path, ideally in the global environment.
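One way to satisfy both prerequisites is to assign the combined data to a named variable in the global environment and view that variable. In this sketch foo, bar and combined are hypothetical names, and the View() call is guarded so the script also runs outside an interactive session.

```r
# Two small example frames (hypothetical data).
foo <- data.frame(id = 1:2, value = c(10, 20))
bar <- data.frame(id = 3:4, value = c(30, 40))

# Store the result in a variable rather than viewing the expression.
combined <- rbind(foo, bar)

if (interactive()) {
  View(combined)  # view the variable itself, not View(rbind(foo, bar))
}

# Later changes are made to the same variable the viewer was opened on.
combined[1, "value"] <- 15
```

Note that combined is itself a copy: changing foo or bar afterwards does not alter it, so edits you want to see should be made to combined.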
Auto-refreshing works even when the data viewer is popped out into its own window, so this is a good way to take advantage of a multi-monitor setup!
The viewer supports column labels, such as those attached by the Hmisc package and by SPSS import from haven and others. Try this:
> library(Hmisc)
> data(women)
> label(women[[1]]) <- "Woman's Height"
> label(women[[2]]) <- "Woman's Weight"
> View(women)
Both the label attribute on individual columns and the variable.labels attribute on the outer frame are supported.
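If you would rather not load Hmisc, the same attributes can be set with base R's attr(). This is a sketch under assumptions: women_labeled and women_spss_style are illustrative names, and the named-character-vector shape used for variable.labels is an assumption about how SPSS importers typically store it.

```r
# Per-column labels: set a "label" attribute on each column.
women_labeled <- women
attr(women_labeled$height, "label") <- "Woman's Height"
attr(women_labeled$weight, "label") <- "Woman's Weight"

# Frame-level labels: one "variable.labels" attribute on the data frame
# (assumed here to be a character vector named by column).
women_spss_style <- women
attr(women_spss_style, "variable.labels") <- c(
  height = "Woman's Height",
  weight = "Woman's Weight"
)

if (interactive()) {
  View(women_labeled)
}
```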
Restrictions and Performance
While rows are unbounded, columns are capped at 100. It’s not currently possible to virtualize columns in the same way as rows, and large numbers of columns cause the interface to slow significantly.
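When a frame has more columns than the viewer displays, one workaround is to view a subset of columns at a time. A minimal sketch with a synthetic frame follows; wide, first_block and second_block are illustrative names.

```r
# A synthetic frame with 150 columns, more than the viewer's cap of 100.
wide <- as.data.frame(matrix(0, nrow = 5, ncol = 150))

# Split the columns into blocks that fit within the cap.
first_block <- wide[, 1:100]
second_block <- wide[, 101:ncol(wide)]

if (interactive()) {
  View(second_block)  # inspect the columns beyond the first 100
}
```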
Finally, while we’ve made every effort to keep things speedy, very large amounts of data may cause sluggishness, especially when a sort or filter is applied, as this requires R to fully scan the frame. If you’re working with large frames, try applying filters to reduce them to the subset you’re interested in to improve performance.
At this time it’s not possible to extract the “current view” as an R object, or to save the manipulations therein as an R script; the data viewer is a feature designed to help you during exploratory data analysis and does not aim to produce a reproducible transformation.
In a future release, we may add the ability to export the transformations as e.g. a