Read data from an Excel file or Workbook object into a data.frame
read.xlsx(
xlsxFile,
sheet,
startRow = 1,
colNames = TRUE,
rowNames = FALSE,
detectDates = FALSE,
skipEmptyRows = TRUE,
skipEmptyCols = TRUE,
rows = NULL,
cols = NULL,
check.names = FALSE,
sep.names = ".",
namedRegion = NULL,
na.strings = "NA",
fillMergedCells = FALSE
)
An xlsx file, Workbook object or URL to xlsx file.
The name or index of the sheet to read data from.
first row to begin looking for data. Empty rows at the top of a file are always skipped, regardless of the value of startRow.
If TRUE
, the first row of data will be used as column names.
If TRUE
, first column of data will be used as row names.
If TRUE
, attempt to recognise dates and perform conversion.
If TRUE
, empty rows are skipped else empty rows after the first row containing data
will return a row of NAs.
If TRUE
, empty columns are skipped.
A numeric vector specifying which rows in the Excel file to read. If NULL, all rows are read.
A numeric vector specifying which columns in the Excel file to read. If NULL, all columns are read.
logical. If TRUE then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names
One character which substitutes blanks in column names. By default, "."
A named region in the Workbook. If not NULL startRow, rows and cols parameters are ignored.
A character vector of strings which are to be interpreted as NA. Blank cells will be returned as NA.
If TRUE, the value in a merged cell is given to all cells within the merge.
data.frame
Formulae written using writeFormula to a Workbook object will not get picked up by read.xlsx(). This is because only the formula is written and left to be evaluated when the file is opened in Excel. Opening, saving and closing the file with Excel will resolve this.
xlsxFile <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
df1 <- read.xlsx(xlsxFile = xlsxFile, sheet = 1, skipEmptyRows = FALSE)
sapply(df1, class)
#> Var1 Var2 Var3 Var4 Var5 Var6
#> "logical" "numeric" "numeric" "character" "numeric" "character"
#> Var7
#> "numeric"
df2 <- read.xlsx(xlsxFile = xlsxFile, sheet = 3, skipEmptyRows = TRUE)
df2$Date <- convertToDate(df2$Date)
sapply(df2, class)
#> Date value word bool wordZ2
#> "Date" "numeric" "character" "logical" "character"
head(df2)
#> Date value word bool wordZ2
#> 1 2014-04-28 0.8390764 N-U-B-R-A FALSE FALSE-Z
#> 2 2014-04-27 0.8863800 N-Z-P-S-Y TRUE TRUE-Z
#> 3 2014-04-26 0.5741314 C-G-D-X-H TRUE TRUE-Z
#> 4 2014-04-25 0.1366065 <NA> FALSE FALSE-Z
#> 5 2014-04-24 0.3692582 B-K-A-O-W TRUE TRUE-Z
#> 6 2014-04-23 NA H-P-G-O-K TRUE TRUE-Z
df2 <- read.xlsx(
xlsxFile = xlsxFile, sheet = 3, skipEmptyRows = TRUE,
detectDates = TRUE
)
sapply(df2, class)
#> Date value word bool wordZ2
#> "Date" "numeric" "character" "logical" "character"
head(df2)
#> Date value word bool wordZ2
#> 1 2014-04-28 0.8390764 N-U-B-R-A FALSE FALSE-Z
#> 2 2014-04-27 0.8863800 N-Z-P-S-Y TRUE TRUE-Z
#> 3 2014-04-26 0.5741314 C-G-D-X-H TRUE TRUE-Z
#> 4 2014-04-25 0.1366065 <NA> FALSE FALSE-Z
#> 5 2014-04-24 0.3692582 B-K-A-O-W TRUE TRUE-Z
#> 6 2014-04-23 NA H-P-G-O-K TRUE TRUE-Z
wb <- loadWorkbook(system.file("extdata", "readTest.xlsx", package = "openxlsx"))
df3 <- read.xlsx(wb, sheet = 2, skipEmptyRows = FALSE, colNames = TRUE)
df4 <- read.xlsx(xlsxFile, sheet = 2, skipEmptyRows = FALSE, colNames = TRUE)
all.equal(df3, df4)
#> [1] TRUE
wb <- loadWorkbook(system.file("extdata", "readTest.xlsx", package = "openxlsx"))
df3 <- read.xlsx(wb,
sheet = 2, skipEmptyRows = FALSE,
cols = c(1, 4), rows = c(1, 3, 4)
)
## URL
##
if (FALSE) { # \dontrun{
xlsxFile <- "https://github.com/awalker89/openxlsx/raw/master/inst/readTest.xlsx"
head(read.xlsx(xlsxFile))
} # }