R> na_profits <- is.na(Forbes2000$profits)
R> table(na_profits)
na_profits
FALSE  TRUE 
 1995     5 

R> Forbes2000[na_profits,
+             c("name", "sales", "profits", "assets")]
                      name sales profits assets
772                    AMP  5.40      NA  42.94
1085                   HHG  5.68      NA  51.65
1091                   NTL  3.50      NA  10.59
1425      US Airways Group  5.50      NA   8.58
1909 Laidlaw International  4.48      NA   3.98

where the function is.na returns a logical vector that is TRUE wherever the corresponding element of the supplied vector is NA. A more convenient approach is available when we want to remove all observations with at least one missing value from a data.frame object. The function complete.cases takes a data.frame and returns a logical vector that is TRUE when the corresponding observation contains no missing values:

R> table(complete.cases(Forbes2000))

FALSE  TRUE 
    5  1995 
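As a small illustration of the difference between the two functions, the following sketch applies is.na and complete.cases to a toy data.frame. The object firms and its values are hypothetical and are not part of the Forbes2000 data.

```r
# Hypothetical toy data, invented for illustration (not from Forbes2000)
firms <- data.frame(
    name    = c("A", "B", "C", "D"),
    sales   = c(5.4, NA, 3.5, 4.5),
    profits = c(NA, 1.2, NA, 0.3),
    stringsAsFactors = FALSE
)

# is.na works element-wise on a single vector (here: one column)
na_profits <- is.na(firms$profits)

# complete.cases works row-wise across all columns of the data.frame
ok <- complete.cases(firms)

# Keep only the observations without any missing value
firms_complete <- firms[ok, ]
print(firms_complete)
```

Here only company D survives, because A and C lack profits and B lacks sales. The base function na.omit(firms) would select the same rows in a single call.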

Subsetting data.frames with logical expressions can require a lot of typing, which can be avoided. The subset function takes a data.frame as its first argument and a logical expression as its second argument. For example, we can select the subset of the Forbes 2000 list consisting of all companies situated in the United Kingdom by

R> UKcomp <- subset(Forbes2000, country == "United Kingdom")
R> dim(UKcomp)
[1] 137   8

i.e., 137 of the 2000 companies are from the UK. Note that it is not necessary to extract the variable country from the data.frame Forbes2000 when formulating the logical expression.
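To make the saving in typing concrete, the following sketch compares bracket indexing with subset on a toy data.frame. The object firms and its values are hypothetical and serve only as an illustration.

```r
# Hypothetical toy data, invented for illustration (not from Forbes2000)
firms <- data.frame(
    name    = c("A", "B", "C", "D"),
    country = c("United Kingdom", "Japan", "United Kingdom", "Germany"),
    sales   = c(5.4, 2.1, 3.5, 4.5),
    stringsAsFactors = FALSE
)

# Bracket indexing: the column must be extracted explicitly with $
uk_bracket <- firms[firms$country == "United Kingdom", ]

# subset: the expression is evaluated inside the data.frame,
# so the bare column name is enough
uk_subset <- subset(firms, country == "United Kingdom")

# Both approaches select the same two rows here
print(dim(uk_bracket))
print(dim(uk_subset))
```

One difference worth keeping in mind: when the logical expression evaluates to NA for some rows, subset drops those rows, whereas bracket indexing returns them as rows filled with NA.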

1.7 Simple Summary Statistics

Two functions are helpful for getting an overview of R objects: str and summary, where str gives more detail about data types and summary gives a collection of sensible summary statistics. For example, applying the summary method to the Forbes2000 data set,

R> summary(Forbes2000)

results in the following output

      rank            name                     country
 Min.   :   1.0   Length:2000        United States :751
 1st Qu.: 500.8   Class :character   Japan         :316
 Median :1000.5   Mode  :character   United Kingdom:137
 Mean   :1000.5                      Germany       : 65
