# DATA OBJECTS IN R

[1] "data.frame"

Objects of class data.frame represent data in the traditional table-oriented way. Each row is associated with a single observation and each column corresponds to one variable. The dimensions of such a table can be extracted using the dim function

R> dim(Forbes2000)

[1] 2000    8

Alternatively, the numbers of rows and columns can be found using

R> nrow(Forbes2000)

[1] 2000

R> ncol(Forbes2000)

[1] 8
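As a small illustration of how these three functions relate, the following sketch builds a toy data frame. The object `companies` and its values are made up for the example and are not part of the Forbes2000 data.

```r
# Hypothetical toy data, not taken from Forbes2000
companies <- data.frame(
    rank  = 1:3,
    name  = c("A", "B", "C"),
    sales = c(94.7, 134.2, 76.7)
)

# dim() returns an integer vector: number of rows first, then columns
d <- dim(companies)

# nrow() and ncol() return the two elements of dim() separately
stopifnot(d[1] == nrow(companies), d[2] == ncol(companies))

print(d)
```

Because dim returns both numbers in one vector, nrow and ncol are mainly a matter of convenience when only one of the two is needed.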

The results of both statements show that Forbes2000 has 2000 rows, i.e., observations, the companies in our case, with eight variables describing the observations. The variable names are accessible from

R> names(Forbes2000)

[1] "rank"        "name"        "country"     "category"

[5] "sales"       "profits"     "assets"      "marketvalue"

The values of single variables can be extracted from the Forbes2000 object by their names, for example the ranking of the companies

R> class(Forbes2000[,"rank"])

[1] "integer"

is stored as an integer variable. Brackets [] always indicate a subset of a larger object, in our case a single variable extracted from the whole table. Because data.frames have two dimensions, observations and variables, the comma is required in order to specify that we want a subset of the second dimension, i.e., the variables. The rankings for all 2000 companies are represented in a vector structure, the length of which is given by

R> length(Forbes2000[,"rank"])

[1] 2000
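The role of the comma inside the brackets may be easier to see on a small table. The sketch below uses a made-up data frame, `companies`, which is an assumption for illustration and not part of the Forbes2000 data. The `drop = FALSE` argument shown at the end is standard R but is not discussed in the text above.

```r
# Hypothetical toy data, not taken from Forbes2000
companies <- data.frame(
    rank  = 1:3,
    name  = c("A", "B", "C"),
    sales = c(94.7, 134.2, 76.7)
)

# Leaving the row position empty selects all rows of the named column;
# a single column comes back as a plain vector
r <- companies[, "rank"]
class(r)     # "integer"
length(r)    # one element per row of the table

# Filling the row position instead selects observations; the result
# is still a data.frame because all variables are kept
first_two <- companies[1:2, ]
class(first_two)

# drop = FALSE keeps the table structure even for a single column
one_col <- companies[, "rank", drop = FALSE]
dim(one_col)
```

The position before the comma addresses observations and the position after it addresses variables. Leaving one of them empty means "all of them".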

A vector is the elementary structure for data handling in R and is a set of simple elements, all being objects of the same class. For example, a simple vector of the numbers one to three can be constructed by one of the following commands

R> 1:3

[1] 1 2 3

R> c(1,2,3)

[1] 1 2 3

R> seq(from = 1, to = 3, by = 1)

[1] 1 2 3
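Although the three commands print the same values, the resulting vectors need not share a class, and the rule that all elements belong to one class has a visible consequence when different kinds of values are combined. The following is a minimal sketch; the variable names `a`, `b`, `s` and `m` are chosen here for illustration only.

```r
a <- 1:3
b <- c(1, 2, 3)
s <- seq(from = 1, to = 3, by = 1)

# All three hold the same values ...
all(a == b, b == s)

# ... but the colon operator produces integers, while c() applied to
# plain numeric literals produces double-precision numbers
class(a)   # "integer"
class(b)   # "numeric"

# Mixing classes in c() coerces every element to one common class
m <- c(1, "two", TRUE)
class(m)   # "character"
```

In the last command the number and the logical value are silently converted to character strings, since a vector cannot store elements of different classes side by side.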