AN INTRODUCTION TO R

The unique names of all 2000 companies are stored in a character vector

R> class(Forbes2000[,"name"])

[1] "character"

R> length(Forbes2000[,"name"])

[1] 2000

and the first element of this vector is

R> Forbes2000[,"name"][1]

[1] "Citigroup"

Because the companies are ranked, Citigroup is the world’s largest company according to the Forbes 2000 list. Further details on vectors and subsetting are given in Section 1.6.

Nominal measurements are represented by factor variables in R, such as the category of the company’s business segment

R> class(Forbes2000[,"category"])

[1] "factor"

Objects of class factor and character differ mainly in how their values are stored internally. Each element of a vector of class character is stored as a character string, whereas a factor object stores, for each element, an integer code indicating its level. In our case, there are

R> nlevels(Forbes2000[,"category"])

[1] 27

different levels, i.e., business categories, which can be extracted by

R> levels(Forbes2000[,"category"])

[1] "Aerospace & defense"
[2] "Banking"
[3] "Business services & supplies"

...
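The difference in internal storage between character and factor objects can be made concrete with a small sketch. The vector `segment` and its values are hypothetical stand-ins chosen for illustration; they are not taken from the Forbes2000 data.

```r
# Hypothetical toy data, not part of Forbes2000
segment <- c("Banking", "Retailing", "Banking", "Insurance")
segment_f <- factor(segment)

class(segment)      # "character"
class(segment_f)    # "factor"

# A factor keeps one integer code per element ...
codes <- as.integer(segment_f)
# ... and the set of distinct labels (the levels)
labs <- levels(segment_f)

# Indexing the levels by the codes recovers the original strings
identical(labs[codes], segment)
```

Because only the integer codes are repeated, a factor with few levels and many elements avoids storing the same label over and over.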

As a simple summary statistic, the frequencies of the levels of such a factor variable can be found from

R> table(Forbes2000[,"category"])

         Aerospace & defense                      Banking
                          19                          313
Business services & supplies
                          70

...
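As a rough illustration of working with such a frequency table, the following sketch uses a small hypothetical factor `category` (made-up values, not the Forbes2000 data) to show how individual counts and the most frequent level might be pulled out.

```r
# Hypothetical toy data, not part of Forbes2000
category <- factor(c("Banking", "Retailing", "Banking", "Insurance", "Banking"))

freq <- table(category)               # counts per level, in level order

banking_n <- freq[["Banking"]]        # count for a single level
by_size <- sort(freq, decreasing = TRUE)    # most frequent levels first
top <- names(freq)[which.max(freq)]   # label of the most frequent level

print(by_size)
print(top)
```

The same calls should apply unchanged to `table(Forbes2000[,"category"])`, assuming the data set is loaded.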

The sales, assets, profits and market value variables are of type numeric, the natural data type for continuous or discrete measurements, for example

R> class(Forbes2000[,"sales"])

[1] "numeric"

and simple summary statistics such as the mean, median and range can be found from
