The TITANIC3 data frame describes the survival status of individual passengers on the Titanic. The TITANIC3 data frame does not contain information for the crew, but it does contain actual and estimated ages for almost 80% of the passengers.
TITANIC3
A data frame with 1309 observations on the following 14 variables:
pclass (a factor with levels 1st, 2nd, and 3rd)
survived (Survival where 0 = No; 1 = Yes)
name (Name)
sex (a factor with levels female and male)
age (age in years)
sibsp (Number of Siblings/Spouses Aboard)
parch (Number of Parents/Children Aboard)
ticket (Ticket Number)
fare (Passenger Fare)
cabin (Cabin)
embarked (a factor with levels Cherbourg, Queenstown, and Southampton)
boat (Lifeboat Number)
body (Body Identification Number)
home.dest (Home/Destination)
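As a rough illustration of how the variables above might be used, the sketch below groups rows by pclass to get a survival rate and measures the share of rows with a recorded age. It is a minimal sketch in Python using only the standard library. The rows are invented placeholders that merely follow the column layout described above; they are not records from TITANIC3. The helper names survival_rate_by and share_with_age are hypothetical, and encoding a missing age as None is an assumption.

```python
from collections import defaultdict

# Hypothetical rows that follow the TITANIC3 column layout. They are invented
# for illustration and are NOT records from the real dataset.
rows = [
    {"pclass": "1st", "survived": 1, "sex": "female", "age": 30.0},
    {"pclass": "1st", "survived": 0, "sex": "male", "age": None},
    {"pclass": "3rd", "survived": 0, "sex": "male", "age": 22.0},
    {"pclass": "3rd", "survived": 1, "sex": "female", "age": 4.0},
    {"pclass": "3rd", "survived": 0, "sex": "male", "age": None},
]


def survival_rate_by(rows, key):
    """Return {level: share of rows with survived == 1} for a grouping column."""
    counts = defaultdict(lambda: [0, 0])  # level -> [survivors, total]
    for row in rows:
        counts[row[key]][0] += row["survived"]  # survived is coded 0 or 1
        counts[row[key]][1] += 1
    return {level: s / n for level, (s, n) in counts.items()}


def share_with_age(rows):
    """Return the fraction of rows whose age is recorded (assumed None if missing)."""
    return sum(row["age"] is not None for row in rows) / len(rows)


by_class = survival_rate_by(rows, "pclass")
age_share = share_with_age(rows)
```

The same grouping helper works for any of the factor-like columns, for example survival_rate_by(rows, "sex"). With the real data, the age share would be the quantity the description refers to as almost 80%.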
Thomas Cason from the University of Virginia has greatly updated and improved the titanic data frame using the Encyclopedia Titanica and created a new dataset called TITANIC3. This dataset reflects the state of data available as of August 2, 1999. Some duplicate passengers have been dropped; many errors have been corrected; many missing ages have been filled in; and new variables have been created.
Harrell, F. E. 2001. Regression Modeling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer.
Ugarte, M. D., Militino, A. F., and Arnholt, A. T. 2015. Probability and Statistics with R, Second Edition. Chapman & Hall / CRC.