R can import data from local storage or the Internet and export it locally. Let us first set up the current (working) folder, where our results will be stored.
getwd() # shows current folder
dir() # shows files in the current folder
dir.create("D:/Data/R", recursive = TRUE) # create a folder (and its parent folders, if missing)
setwd("D:/Data/R") # sets the current folder
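If you do not have a `D:` drive, the same steps can be tried in a temporary location. This is only a sketch: the folder under `tempdir()` and the names `old` and `work` are assumptions made for illustration, not part of the course setup.

```r
# Remember the current folder so that it can be restored afterwards
old = getwd()

# Hypothetical working folder inside R's per-session temporary directory
work = file.path(tempdir(), "Data", "R")

# recursive = TRUE also creates the intermediate "Data" folder
dir.create(work, recursive = TRUE, showWarnings = FALSE)

setwd(work)   # make it the current folder
print(dir())  # list its files (a freshly created folder is empty)

setwd(old)    # go back to where we started
```

Using `file.path()` instead of pasting strings keeps the path separators correct on every operating system.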
Here is the EUR/USD exchange rate from January 1999 to April 2017.
We can read values from an unformatted text file using scan():
SomeData = scan("http://edu.modas.lu/data/txt/currency.txt", what = "") # what - defines the value class
head(SomeData)
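To see what the `what` argument does without using the network, here is a minimal sketch that scans a small temporary file. The file contents are made-up values, and the names `tmp`, `tokens` and `cols` are hypothetical.

```r
# Made-up stand-in for a downloaded text file
tmp = tempfile(fileext = ".txt")
writeLines(c("Date EUR", "1999-01-04 1.1789", "1999-01-05 1.1790"), tmp)

# what = "" reads every whitespace-separated token as a character string
tokens = scan(tmp, what = "", quiet = TRUE)
print(tokens)

# A list template reads typed columns: "" = character, 0 = numeric;
# skip = 1 jumps over the header line
cols = scan(tmp, what = list("", 0), skip = 1, quiet = TRUE)
print(cols[[2]])
```

With `what = ""` everything, including the header and the numbers, ends up in one flat character vector, which is why a table-aware reader is more convenient for such data.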
In fact, you can download an entire webpage this way and parse it afterwards. That is fun, but we need the data in a readable form.
We will use read.table() to import the data as a data frame.
Some parameters are important in read.table():
header - set it to TRUE if there is a header line
sep - separator character; "\t" stands for tabulation
as.is - prevents transforming character columns to factors
Currency = read.table("http://edu.modas.lu/data/txt/currency.txt", header=TRUE, sep="\t", as.is=TRUE)
str(Currency)
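The effect of these three parameters can be checked on a tiny in-memory table. This is only a sketch: the text in `txt` is made up, and the `text =` argument of `read.table()` is used here instead of a file or URL.

```r
# Made-up tab-separated text standing in for a downloaded file
txt = "Name\tValue\nA\t1.5\nB\t2.5"

# header = TRUE : the first line holds the column names
# sep = "\t"    : fields are separated by tabs
# as.is = TRUE  : character columns stay character (no factors)
Demo = read.table(text = txt, header = TRUE, sep = "\t", as.is = TRUE)
str(Demo)
```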
Do not forget the functions that let you see what is inside your data:
head(Currency)
summary(Currency)
View(Currency)
Let’s make the first plot.
Hmm… it’s quite ugly… We will improve it later.
R can keep data in gzip-compressed form and load the stored variables straight into memory. Such files have the .RData extension.
This is a fast and easy way to store your data. Let us first download
the data in RData format into your working directory using
download.file() and then load it with load().
Parameters of downloading:
destfile - the file name under which you would like to store the downloaded file
mode - the way you would like to treat the data (as text or binary). To keep binary data unchanged, use "wb"
download.file("http://edu.modas.lu/data/rda/all.RData", destfile="all.RData", mode = "wb")
getwd() # show current folder
dir(pattern=".RData") # show files in the current folder
load("all.RData") # load the data
ls() # you should see 'GE.matrix' among variables
View(GE.matrix)
You can see the row and column names of the loaded object:
attr(GE.matrix,"dimnames") # annotation of the dimensions
rownames(GE.matrix)
colnames(GE.matrix)
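How these three calls relate to each other can be seen on a small matrix. The sketch below assumes a matrix-like object; `M` and its row and column labels are hypothetical and are not taken from `GE.matrix`.

```r
# Hypothetical small matrix with named rows and columns
M = matrix(1:6, nrow = 2,
           dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3")))

# "dimnames" is a list: first element = row names, second = column names
dn = attr(M, "dimnames")
print(identical(dn[[1]], rownames(M)))
print(identical(dn[[2]], colnames(M)))

# Names can be used for indexing instead of positions
print(M["gene2", "s3"])
```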
R can read Excel files using one of its packages,
readxl. Install it and attach the library:
# install.packages("readxl")
library(readxl)
read_excel() can only read from
local folders, not from the Internet! So we will first download the Excel file:
download.file("http://edu.modas.lu/data/xls/cancer.xlsx", destfile="cancer.xlsx", mode = "wb")
getwd()
read_excel() can be used to read both “xls” and
“xlsx” files. Some parameters:
path - path and file name
sheet - either the name of the sheet or its number
col_names - are there column names? (default = TRUE)
col_types - types of the columns; detected automatically by default
It will read the Excel file into a
tibble object, the
tidyverse version of a data.frame. If you wish, you can
transform it into a standard data.frame with as.data.frame().
Cancer = read_excel("cancer.xlsx")
str(Cancer)
## now Cancer is a 'tibble' - tidyverse object for data.frame
## if you prefer standard data.frame:
Cancer = as.data.frame(Cancer)
str(Cancer)
There are several ways to export your data. Let’s consider the simplest ones.
write() - writes a vector of numbers / characters to a text file
write.table() - writes a data table
save() - saves one or several variables into a binary RData file
Some parameters of write.table():
eol - character for the end of line (can differ between operating systems). The standard one is "\n"
dec - decimal separator
quote - whether to put quotes ("") around character values
row.names - whether to write row names as a column
write.table(Currency,file = "curr.txt",sep = "\t", eol = "\n", na = "NA", dec = ".", row.names = FALSE, quote=FALSE)
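To check what such a call actually puts on disk, the sketch below writes a small made-up data frame to a temporary file with the same parameters and reads it back. The names `Demo`, `out` and `Back` are hypothetical.

```r
# Made-up data frame with one missing value
Demo = data.frame(Name = c("A", "B"), Value = c(1.5, NA),
                  stringsAsFactors = FALSE)

out = tempfile(fileext = ".txt")
write.table(Demo, file = out, sep = "\t", eol = "\n", na = "NA",
            dec = ".", row.names = FALSE, quote = FALSE)

# Raw lines of the file: header first, then one line per record
print(readLines(out))

# Reading it back with matching parameters restores the table
Back = read.table(out, header = TRUE, sep = "\t", as.is = TRUE)
str(Back)
```

Because `row.names = FALSE` and `quote = FALSE` are used, the file contains only the header and the plain tab-separated values, and the missing value is written as the string given in `na`.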
You can also save objects in binary format (faster, and the file is smaller):
save(Currency, file="Currency.RData") # save as binary file
save(list=ls(), file="workspace.RData") # save all variables as binary file
getwd()
dir() # see the results
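A round trip shows that `save()` stores the variables together with their names. In this sketch the variables `x` and `y`, the temporary file and the separate environment `e` are all assumptions for illustration; loading into `e` simply avoids overwriting anything in your workspace.

```r
# Two made-up variables to store
x = 1:3
y = "text"

rda = tempfile(fileext = ".RData")

# list = takes the variable names as a character vector
save(list = c("x", "y"), file = rda)

# Load into a separate environment; load() returns the restored names
e = new.env()
restored = load(rda, envir = e)
print(sort(restored))
print(e$x)
print(e$y)
```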
- The dataset at http://edu.modas.lu/data/txt/shop.txt contains customer records collected by a women’s apparel store. Check its structure and view its summary.
- For the “shop” table, save into a new text file only the records of customers who paid with a Visa card.