Data for Statistical Insight Chapter 5

Homes

Format

A data frame/tibble with 65 observations on four variables

city

a character variable with values Akron OH, Albuquerque NM, Anaheim CA, Atlanta GA, Baltimore MD, Baton Rouge LA, Birmingham AL, Boston MA, Bradenton FL, Buffalo NY, Charleston SC, Chicago IL, Cincinnati OH, Cleveland OH, Columbia SC, Columbus OH, Corpus Christi TX, Dallas TX, Daytona Beach FL, Denver CO, Des Moines IA, Detroit MI, El Paso TX, Grand Rapids MI, Hartford CT, Honolulu HI, Houston TX, Indianapolis IN, Jacksonville FL, Kansas City MO, Knoxville TN, Las Vegas NV, Los Angeles CA, Louisville KY, Madison WI, Memphis TN, Miami FL, Milwaukee WI, Minneapolis MN, Mobile AL, Nashville TN, New Haven CT, New Orleans LA, New York NY, Oklahoma City OK, Omaha NE, Orlando FL, Philadelphia PA, Phoenix AZ, Pittsburgh PA, Portland OR, Providence RI, Sacramento CA, Salt Lake City UT, San Antonio TX, San Diego CA, San Francisco CA, Seattle WA, Spokane WA, St Louis MO, Syracuse NY, Tampa FL, Toledo OH, Tulsa OK, and Washington DC

region

a character variable with values Midwest, Northeast, South, and West

year

a factor with levels 1994 and 2000

price

median house price (in dollars)

Source

National Association of Realtors.

References

Kitchens, L. J. (2003) Basic Statistics and Data Analysis. Pacific Grove, CA: Brooks/Cole, a division of Thomson Learning.

Examples


# Mean price for each year
tapply(Homes$price, Homes$year, mean)
#>     1994     2000 
#> 107158.5 136493.8 
# Mean price for each region
tapply(Homes$price, Homes$region, mean)
#>   Midwest Northeast     South      West 
#> 106402.94 133309.09  93769.57 177625.00 
# Split the data into one subset per year
p2000 <- subset(Homes, year == "2000")
p1994 <- subset(Homes, year == "1994")
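if (FALSE) {
# Not run: requires the dplyr and ggplot2 packages
library(dplyr)
library(ggplot2)
# Mean price for each year-by-region combination
Homes %>%
  group_by(year, region) %>%
  summarize(AvgPrice = mean(price))
# Boxplots of price by region, one panel per year
ggplot(data = Homes, aes(x = region, y = price)) +
  geom_boxplot() +
  theme_bw() +
  facet_grid(year ~ .)
}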