R can import data from local storage or from the Internet, and export it locally. Let us first set up the current folder, where our results will be stored.
getwd() # shows current folder
dir() # shows files in the current folder
dir.create("D:/Data/R") # create a folder
setwd("D:/Data/R") # sets the current folder
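The path above only works on a Windows machine with a D: drive. A minimal, more portable sketch of the same steps, assuming a hypothetical folder name `R_course` placed under the session's temporary directory (replace it with your own location):

```r
# Hypothetical folder for this illustration; tempdir() always exists
out_dir <- file.path(tempdir(), "R_course")

# Create the folder only if it is not there yet
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

old_dir <- setwd(out_dir)  # setwd() returns the previous folder
getwd()                    # now points to out_dir
setwd(old_dir)             # go back to where we started
```

Checking with dir.exists() first avoids the warning that dir.create() gives when the folder is already there.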
Here is the EUR/USD exchange rate from January 1999 to April 2017.
We can read values from an unformatted text file using scan().
SomeData = scan("http://edu.modas.lu/data/txt/currency.txt",what = "") # what - defines the value class
head(SomeData)
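To see what the what parameter does without touching the network, here is a small sketch that scans an in-memory string instead of a URL. The variable names and the three sample values are only for illustration:

```r
# A few whitespace-separated values, as they might appear in a text file
txt <- "1.1867 1.1760 1.1629"

# what = "" reads every value as character
as_char <- scan(text = txt, what = "", quiet = TRUE)

# what = numeric() reads the same values as numbers
as_num <- scan(text = txt, what = numeric(), quiet = TRUE)

class(as_char)
class(as_num)
```

The type of the object passed as what decides the class of the result, so the same text can be read either way.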
In fact, you can download an entire webpage with scan() and parse it afterwards. That is fun, but here we need the data in a readable form.
We will use read.table() to import the data as a data frame.
Date | EUR |
---|---|
1999-01-04 | 1.1867 |
1999-01-05 | 1.1760 |
1999-01-06 | 1.1629 |
1999-01-07 | 1.1681 |
1999-01-08 | 1.1558 |
Some parameters are important in read.table():
- header: set it TRUE if there is a header line
- sep: separator character. "\t" stands for tabulation
- as.is: prevents transforming character columns to factors.
Currency = read.table("http://edu.modas.lu/data/txt/currency.txt", header=T, sep="\t", as.is=T)
str(Currency)
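The same parameters can be tried offline on a few rows. The sketch below feeds the first three rows of the table above to read.table() through its text argument. The names `sample_txt` and `Sample` are made up for this illustration:

```r
# Three rows of the currency table as tab-separated text
sample_txt <- "Date\tEUR\n1999-01-04\t1.1867\n1999-01-05\t1.1760\n1999-01-06\t1.1629"

# header = TRUE: first line holds column names
# sep = "\t":    columns are separated by tabs
# as.is = TRUE:  keep Date as character instead of factor
Sample <- read.table(text = sample_txt, header = TRUE, sep = "\t", as.is = TRUE)
str(Sample)

# Character dates in ISO format can be converted afterwards if needed
Sample$Date <- as.Date(Sample$Date)
str(Sample)
```

Keeping the dates as character first and converting them explicitly makes it easy to spot malformed entries before they become NA.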
Do not forget the functions that let you see what is inside your data:
head(Currency)
summary(Currency)
View(Currency)
Let’s make the first plot.
plot(Currency$EUR)
Hmm… it’s quite ugly… We will improve it later.
R can keep data in gzip-compressed form and load the variables into memory automatically. Such files have the .RData extension. This is a fast and easy way to store your data. Let us first download the data in RData format into your working directory using download.file() and then load it with load().
Parameters of downloading:
- destfile: the file name under which you would like to store the downloaded file.
- mode: the way you would like to treat the data (as text or binary). To keep binary data unchanged, use wb!
download.file("http://edu.modas.lu/data/rda/all.RData",
              destfile="all.RData",mode = "wb")
getwd() # show current folder
dir(pattern="\\.RData$") # show .RData files in the current folder (pattern is a regular expression)
load("all.RData") # load the data
ls() # you should see 'GE.matrix' among variables
View(GE.matrix)
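When a script is run several times, it can be convenient to skip the download if the file is already in the working folder. A minimal sketch, assuming a hypothetical helper name `fetch_once` (it is not part of R):

```r
# Hypothetical helper: download a file in binary mode only if it is missing
fetch_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)  # return the local path for further use
}

# Usage with the file from this section (uncomment to run; needs Internet):
# load(fetch_once("http://edu.modas.lu/data/rda/all.RData", "all.RData"))
```

The helper does not check whether an existing file is complete, so delete a partially downloaded file before calling it again.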
You can see the row and column names of the loaded GE.matrix object:
attr(GE.matrix,"dimnames") # annotation of the dimensions
rownames(GE.matrix)
colnames(GE.matrix)
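The same calls can be tried on a toy object without downloading anything. The sketch below builds a small named matrix, saves it to a temporary .RData file and loads it back into a separate environment. All names (`toy`, `gene1`, `s1`, ...) are invented for this illustration:

```r
# A 2 x 3 matrix with row and column names
toy <- matrix(1:6, nrow = 2,
              dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3")))

rda <- tempfile(fileext = ".RData")
save(toy, file = rda)              # write the variable to a binary file

env <- new.env()                   # load here instead of the workspace
loaded <- load(rda, envir = env)   # load() returns the restored names
loaded

attr(env$toy, "dimnames")          # list of row names and column names
rownames(env$toy)
colnames(env$toy)
```

Loading into a separate environment shows exactly which variables a file contains and keeps it from silently overwriting objects that already exist in the workspace.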
ToDo: RDS file
R can read Excel files using one of the tidyverse packages: readxl. Install it and attach the library:
# install.packages("readxl")
library(readxl)
Note: read_excel() can only read local files, not files from the Internet! So we will first download the Excel file:
download.file("http://edu.modas.lu/data/xls/cancer.xlsx",destfile="cancer.xlsx",mode = "wb")
getwd()
Function read_excel() can be used to read both “xls” and “xlsx” files. Some parameters:
- path: path and file name
- sheet: either name of the sheet or its number
- col_names: are there column names? (default = TRUE)
- col_types: types of the columns. Automatically detected by default
It will read the Excel file into a tibble object, the tidyverse version of a data.frame. If you wish, you can transform it with the as.data.frame() function.
Cancer = read_excel("cancer.xlsx")
str(Cancer)
## now Cancer is a 'tibble' - tidyverse object for data.frame
## if you prefer standard data.frame:
Cancer = as.data.frame(Cancer)
str(Cancer)
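The sheet and col_names parameters can be tried on the example workbook that ships with readxl, so no download is needed. This is only a sketch: it assumes readxl is installed and uses its bundled file datasets.xlsx, and the variable names are invented for the illustration:

```r
# Run only if readxl is available
if (requireNamespace("readxl", quietly = TRUE)) {
  # Path to an example workbook bundled with the package
  path <- readxl::readxl_example("datasets.xlsx")

  # List the sheets, then read one of them by name
  sheets <- readxl::excel_sheets(path)
  print(sheets)
  Cars <- readxl::read_excel(path, sheet = "mtcars", col_names = TRUE)
  print(class(Cars))      # tibble classes

  # Convert to a standard data.frame
  Cars_df <- as.data.frame(Cars)
  print(class(Cars_df))
}
```

Calling excel_sheets() first is a quick way to find the sheet names before deciding which one to read.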
There are several ways to export your data. Let’s consider the simplest ones.
- write(): writes a column of numbers / characters
- write.table(): writes a data table
- save(): saves one or several variables into a binary RData file.
Parameters of write.table are:
- eol: character for the end of line (can differ between operating systems). The standard one is "\n"
- dec: decimal separator
- quote: do we put "" around character values or not
- row.names: do we put row names as a column or not
write.table(Currency,file = "curr.txt",sep = "\t",
            eol = "\n", na = "NA", dec = ".",
            row.names = FALSE, quote=FALSE)
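A quick way to check an export is to read the file back and compare it with the original. The sketch below does this round trip on a two-row data frame written to a temporary file. The names `Small`, `Back` and `out` are made up for this illustration:

```r
# Small data frame with the first two rows of the currency table
Small <- data.frame(Date = c("1999-01-04", "1999-01-05"),
                    EUR  = c(1.1867, 1.1760),
                    stringsAsFactors = FALSE)

out <- tempfile(fileext = ".txt")

# Same settings as above: tab-separated, no row names, no quotes
write.table(Small, file = out, sep = "\t", eol = "\n", dec = ".",
            row.names = FALSE, quote = FALSE)

readLines(out)   # look at the raw lines of the file

# Read the file back with matching header and separator settings
Back <- read.table(out, header = TRUE, sep = "\t", as.is = TRUE)
all.equal(Small$EUR, Back$EUR)
identical(Small$Date, Back$Date)
```

If the settings used for reading do not match those used for writing (for example a different sep or dec), the comparison fails, which makes mistakes visible early.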
You can also save objects in binary format (faster, and the file is smaller):
save(Currency,file="Currency.RData") # save as binary file
save(list=ls(),file="workspace.RData") # save all variables as binary file
getwd()
dir() # see the results
a. Dataset from http://edu.modas.lu/data/txt/shop.txt contains records about customers, collected by a women’s apparel store. Check its structure. View its summary.
read.table, View, str, summary, head
b. For the “shop” table, save into a new text file only the records for customers who paid using a Visa card.
write.table
c. Calculate mean Sales for men and women.
mean
d. Build a table showing mean Sales for single/married men/women (contingency table).
mean
e. Download and open the beer.xlsx file from http://edu.modas.lu/data/xls/beer.xlsx. Build a contingency table using the function table() to check beer preferences among men and women in the study.
download.file, read_excel, table
f*. Export the contingency table (e) into a CSV file.
table, unclass, as.data.frame, write.table