

where the extension .rda is standard. We can list the names of all files with the extension .rda in the working directory:

R> list.files(pattern = "\\.rda")

[1] "Forbes2000.rda"

and we can load the contents of the file into R by

R> load("Forbes2000.rda")
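The round trip through an .rda file can be sketched as follows. This is a minimal illustration, not the book's own code: the data frame `toy`, the sub-directory `rda_demo` and the file name `toy.rda` are made up for the example, and a temporary directory is used in place of the working directory.

```r
# Hypothetical toy data; this is not the Forbes2000 data set.
toy <- data.frame(name = c("A Corp", "B Ltd"), sales = c(10.5, 7.2))

# Use a fresh temporary directory so that no existing files are touched.
demo_dir <- file.path(tempdir(), "rda_demo")
dir.create(demo_dir, showWarnings = FALSE)
path <- file.path(demo_dir, "toy.rda")

# save() writes the named objects to a binary file.
save(toy, file = path)

# The trailing "$" anchors the pattern at the end of the file name,
# so a file such as "notes.rda.txt" would not match.
found <- list.files(demo_dir, pattern = "\\.rda$")

# load() restores the objects under their saved names and
# returns those names as a character vector.
rm(toy)
restored <- load(path)
```

Note that load() assigns the restored objects directly into the calling environment, so an existing object of the same name is silently overwritten.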

1.6 Basic Data Manipulation

The examples shown in the previous section have illustrated the importance of data.frames for storing and handling tabular data in R. Internally, a data.frame is a list of vectors of a common length n, the number of rows of the table. Each of those vectors represents the measurements of one variable, and we have seen that we can access such a variable by its name, for example, the names of the companies:
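The list-of-vectors view can be checked directly on a small table. The sketch below uses a hypothetical three-row data.frame called `firms` (an assumption for illustration, standing in for a data set such as Forbes2000) and shows that access by name works in matrix style, list style and with the `$` operator.

```r
# Hypothetical three-row table; the names and values are invented.
firms <- data.frame(
  name  = c("A Corp", "B Ltd", "C Inc"),
  sales = c(10.5, 7.2, 3.9),
  stringsAsFactors = FALSE
)

# A data.frame is a list, and each element is one column.
is_a_list   <- is.list(firms)
col_lengths <- lengths(firms)   # one length per column
n           <- nrow(firms)      # the common length n

# Three ways to extract one variable by its name.
by_matrix_style <- firms[, "name"]
by_list_style   <- firms[["name"]]
by_dollar       <- firms$name

# All three return the same character vector.
same <- identical(by_matrix_style, by_list_style) &&
  identical(by_list_style, by_dollar)
```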

R> companies <- Forbes2000[,"name"]

Of course, the companies vector is of class character and of length 2000. A subset of the elements of the vector companies can be extracted using the [] subset operator. For example, the largest of the 2000 companies listed in the Forbes 2000 list is

R> companies[1]

[1] "Citigroup"

and the top three companies can be extracted utilising an integer vector of the numbers one to three:

R> 1:3

[1] 1 2 3

R> companies[1:3]

[1] "Citigroup"           "General Electric"
[3] "American Intl Group"

In contrast to indexing with positive integers, negative indexing returns all elements which are not part of the index vector given in brackets. For example, all companies except those with numbers four to two-thousand, i.e., the top three companies, are again

R> companies[-(4:2000)]

[1] "Citigroup"           "General Electric"
[3] "American Intl Group"

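Positive and negative indexing can be compared on a short vector. The vector `x` below is hypothetical and only serves as a sketch; the last step illustrates why the parentheses in an expression such as -(4:2000) are needed.

```r
# Hypothetical five-element vector standing in for the companies vector.
x <- c("first", "second", "third", "fourth", "fifth")

# Positive indices select the listed elements.
top3_pos <- x[1:3]

# Negative indices drop the listed elements and keep the rest.
top3_neg <- x[-(4:length(x))]

# Both routes give the same three elements.
same <- identical(top3_pos, top3_neg)

# Without parentheses, -4:5 is the sequence -4, -3, ..., 5. It mixes
# negative and positive indices, which the subset operator rejects.
mixed <- tryCatch(x[-4:5], error = function(e) "error")
```
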
The complete information about the top three companies can be printed in a similar way. Because data.frames have a concept of rows and columns, we need to separate the subsets corresponding to rows and columns by a comma.
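As a small sketch of the comma convention, the hypothetical data.frame `firms` below (an assumption for illustration, not part of the Forbes2000 data) is subset by rows, by columns, and by both at once.

```r
# Hypothetical four-row table; the names and values are invented.
firms <- data.frame(
  name    = c("A Corp", "B Ltd", "C Inc", "D Plc"),
  sales   = c(10.5, 7.2, 3.9, 1.1),
  profits = c(1.2, 0.4, -0.1, 0.3),
  stringsAsFactors = FALSE
)

# Rows go before the comma, columns after it.
top2 <- firms[1:2, c("name", "sales")]

# An empty column slot keeps every column.
all_cols <- firms[1:2, ]

# An empty row slot keeps every row; a single column comes back as a vector.
one_col <- firms[, "sales"]
```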

The statement

R> Forbes2000[1:3, c("name", "sales", "profits", "assets")]
