Overview

Dataset statistics

Number of variables: 12
Number of observations: 890
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 60.0 KiB
Average record size in memory: 69.0 B
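The average record size appears to be the total in-memory size divided by the number of observations. A quick check, assuming 1 KiB = 1024 B and taking the rounded 60.0 KiB figure at face value:

```python
# Sanity check: average record size = total memory / number of rows.
# Assumes 1 KiB = 1024 bytes; 60.0 KiB is itself a rounded figure.
total_size_kib = 60.0
n_observations = 890

avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints 69.0 B
```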

Variable types

Numeric: 5
Categorical: 7

Alerts

Survived is highly overall correlated with Sex_female and 1 other field (High correlation)
Embarked_C is highly overall correlated with Embarked_S (High correlation)
Embarked_S is highly overall correlated with Embarked_C (High correlation)
Sex_female is highly overall correlated with Survived and 1 other field (High correlation)
Sex_male is highly overall correlated with Survived and 1 other field (High correlation)
Embarked_Q is highly imbalanced (57.5%) (Imbalance)
PassengerId is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
SibSp has 607 (68.2%) zeros (Zeros)
Parch has 677 (76.1%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)
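
The 57.5% imbalance figure for Embarked_Q is consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that definition (the report itself does not state it) and uses the Embarked_Q counts listed in the Variables section (813 zeros, 77 ones). `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes, 1 means a single class.
    Assumed definition; hypothetical helper, not a library function.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(counts))

embarked_q = [813, 77]  # counts of 0 and 1 taken from the report
score = imbalance_score(embarked_q)
print(f"{score:.1%}")  # prints 57.5%
```

Under this definition a 50/50 split scores 0, so the score measures how far the class shares are from uniform rather than the share of the majority class (91.3% here).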

Reproduction

Analysis started: 2023-06-20 12:18:34.653297
Analysis finished: 2023-06-20 12:18:38.180776
Duration: 3.53 seconds
Software version: ydata-profiling v4.2.0
Download configuration: config.json
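
A report like this one can be regenerated from the same DataFrame. The sketch below assumes the `ydata_profiling` package named under Software version is installed and uses its `ProfileReport` class. `build_report`, the default output path and the title are hypothetical, and the input file in the usage comment is a placeholder.

```python
def build_report(df, out_path="report.html", title="Profiling report"):
    """Profile a pandas DataFrame and write the HTML report to out_path.

    Assumes ydata-profiling is installed; the import is deferred so that
    defining this helper does not require the package.
    """
    from ydata_profiling import ProfileReport

    profile = ProfileReport(df, title=title)
    profile.to_file(out_path)
    return out_path

# Example usage (file name is a placeholder, not taken from the report):
# import pandas as pd
# build_report(pd.read_csv("prepared_passengers.csv"))
```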

Variables

PassengerId
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 890
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 445.79213
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.45
Q1: 223.25
Median: 445.5
Q3: 668.75
95-th percentile: 846.55
Maximum: 891
Range: 890
Interquartile range (IQR): 445.5

Descriptive statistics

Standard deviation: 257.4237
Coefficient of variation (CV): 0.5774523
Kurtosis: -1.2002272
Mean: 445.79213
Median Absolute Deviation (MAD): 223
Skewness: 0.0020094532
Sum: 396755
Variance: 66266.959
Monotonicity: Not monotonic
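
Several of these statistics are derived from one another, which gives a cheap consistency check. The sketch below recomputes the derived figures from the reported PassengerId values. The relationships used are the standard definitions and are assumed, not confirmed, to be the ones the profiler applies.

```python
import math

# Values copied from the PassengerId statistics in the report.
n = 890
total = 396755
mean = 445.79213
std = 257.4237
q1, q3 = 223.25, 668.75
minimum, maximum = 1, 891

# Each entry maps a relationship to (recomputed value, reported value).
checks = {
    "mean = sum / n": (total / n, mean),
    "variance = std ** 2": (std ** 2, 66266.959),
    "CV = std / mean": (std / mean, 0.5774523),
    "IQR = Q3 - Q1": (q3 - q1, 445.5),
    "range = max - min": (maximum - minimum, 890),
}
for name, (computed, reported) in checks.items():
    ok = math.isclose(computed, reported, rel_tol=1e-5)
    print(f"{name}: computed={computed:.6g} reported={reported} ok={ok}")
```

The tolerance allows for the rounding already present in the reported figures.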
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
446                 1      0.1%
133                 1      0.1%
181                 1      0.1%
183                 1      0.1%
185                 1      0.1%
161                 1      0.1%
160                 1      0.1%
159                 1      0.1%
158                 1      0.1%
129                 1      0.1%
Other values (880)  880    98.9%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
891    1      0.1%
890    1      0.1%
889    1      0.1%
888    1      0.1%
887    1      0.1%
886    1      0.1%
885    1      0.1%
884    1      0.1%
883    1      0.1%
882    1      0.1%

Survived
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
0      549
1      341

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      549    61.7%
1      341    38.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      549    61.7%
1      341    38.3%

Most occurring characters

Value  Count  Frequency (%)
0      549    61.7%
1      341    38.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      549    61.7%
1      341    38.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      549    61.7%
1      341    38.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      549    61.7%
1      341    38.3%

Pclass
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
3      491
1      215
2      184

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
3      491    55.2%
1      215    24.2%
2      184    20.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
3      491    55.2%
1      215    24.2%
2      184    20.7%

Most occurring characters

Value  Count  Frequency (%)
3      491    55.2%
1      215    24.2%
2      184    20.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      491    55.2%
1      215    24.2%
2      184    20.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
3      491    55.2%
1      215    24.2%
2      184    20.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
3      491    55.2%
1      215    24.2%
2      184    20.7%

Age
Real number (ℝ)

Distinct: 97
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.276088
Minimum: 0.42
Maximum: 74
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.9 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 6
Q1: 21
Median: 27.255814
Q3: 36
95-th percentile: 54
Maximum: 74
Range: 73.58
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 13.266279
Coefficient of variation (CV): 0.45314384
Kurtosis: 0.52458789
Mean: 29.276088
Median Absolute Deviation (MAD): 7.744186
Skewness: 0.45061974
Sum: 26055.718
Variance: 175.99416
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
27.25581395        85     9.6%
24                 30     3.4%
22                 27     3.0%
18                 26     2.9%
30                 25     2.8%
19.32978723        25     2.8%
28                 25     2.8%
19                 25     2.8%
21                 24     2.7%
25                 23     2.6%
Other values (87)  575    64.6%

Value  Count  Frequency (%)
0.42   1      0.1%
0.67   1      0.1%
0.75   2      0.2%
0.83   2      0.2%
0.92   1      0.1%
1      7      0.8%
2      10     1.1%
3      6      0.7%
4      10     1.1%
5      4      0.4%

Value  Count  Frequency (%)
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
66     1      0.1%
65     3      0.3%
64     2      0.2%
63     2      0.2%
62     4      0.4%
61     3      0.3%

SibSp
Real number (ℝ)

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52359551
Minimum: 0
Maximum: 8
Zeros: 607
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 13.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1032239
Coefficient of variation (CV): 2.1070156
Kurtosis: 17.859695
Mean: 0.52359551
Median Absolute Deviation (MAD): 0
Skewness: 3.6932052
Sum: 466
Variance: 1.2171029
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      607    68.2%
1      209    23.5%
2      28     3.1%
4      18     2.0%
3      16     1.8%
8      7      0.8%
5      5      0.6%

Value  Count  Frequency (%)
0      607    68.2%
1      209    23.5%
2      28     3.1%
3      16     1.8%
4      18     2.0%
5      5      0.6%
8      7      0.8%

Value  Count  Frequency (%)
8      7      0.8%
5      5      0.6%
4      18     2.0%
3      16     1.8%
2      28     3.1%
1      209    23.5%
0      607    68.2%

Parch
Real number (ℝ)

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38202247
Minimum: 0
Maximum: 6
Zeros: 677
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 13.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80640878
Coefficient of variation (CV): 2.1108936
Kurtosis: 9.7643575
Mean: 0.38202247
Median Absolute Deviation (MAD): 0
Skewness: 2.7471392
Sum: 340
Variance: 0.65029512
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      677    76.1%
1      118    13.3%
2      80     9.0%
3      5      0.6%
5      5      0.6%
4      4      0.4%
6      1      0.1%

Value  Count  Frequency (%)
0      677    76.1%
1      118    13.3%
2      80     9.0%
3      5      0.6%
4      4      0.4%
5      5      0.6%
6      1      0.1%

Value  Count  Frequency (%)
6      1      0.1%
5      5      0.6%
4      4      0.4%
3      5      0.6%
2      80     9.0%
1      118    13.3%
0      677    76.1%

Fare
Real number (ℝ)

Distinct: 248
Distinct (%): 27.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.206685
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 13.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9031
Median: 14.4542
Q3: 31
95-th percentile: 112.19873
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0969

Descriptive statistics

Standard deviation: 49.721315
Coefficient of variation (CV): 1.5438197
Kurtosis: 33.3567
Mean: 32.206685
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7845046
Sum: 28663.949
Variance: 2472.2091
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
8.05                43     4.8%
13                  42     4.7%
7.8958              38     4.3%
7.75                34     3.8%
26                  31     3.5%
10.5                24     2.7%
7.925               18     2.0%
7.775               16     1.8%
26.55               15     1.7%
0                   15     1.7%
Other values (238)  614    69.0%

Value   Count  Frequency (%)
0       15     1.7%
4.0125  1      0.1%
5       1      0.1%
6.2375  1      0.1%
6.4375  1      0.1%
6.45    1      0.1%
6.4958  2      0.2%
6.75    2      0.2%
6.8583  1      0.1%
6.95    1      0.1%

Value     Count  Frequency (%)
512.3292  3      0.3%
263       4      0.4%
262.375   2      0.2%
247.5208  2      0.2%
227.525   4      0.4%
221.7792  1      0.1%
211.5     1      0.1%
211.3375  3      0.3%
164.8667  2      0.2%
153.4625  3      0.3%

Embarked_C
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
0      722
1      168

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      722    81.1%
1      168    18.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      722    81.1%
1      168    18.9%

Most occurring characters

Value  Count  Frequency (%)
0      722    81.1%
1      168    18.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      722    81.1%
1      168    18.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      722    81.1%
1      168    18.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      722    81.1%
1      168    18.9%

Embarked_Q
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
0      813
1      77

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      813    91.3%
1      77     8.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      813    91.3%
1      77     8.7%

Most occurring characters

Value  Count  Frequency (%)
0      813    91.3%
1      77     8.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      813    91.3%
1      77     8.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      813    91.3%
1      77     8.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      813    91.3%
1      77     8.7%

Embarked_S
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
1      645
0      245

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      645    72.5%
0      245    27.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      645    72.5%
0      245    27.5%

Most occurring characters

Value  Count  Frequency (%)
1      645    72.5%
0      245    27.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      645    72.5%
0      245    27.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      645    72.5%
0      245    27.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      645    72.5%
0      245    27.5%

Sex_female
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
0      576
1      314

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      576    64.7%
1      314    35.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      576    64.7%
1      314    35.3%

Most occurring characters

Value  Count  Frequency (%)
0      576    64.7%
1      314    35.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      576    64.7%
1      314    35.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      576    64.7%
1      314    35.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      576    64.7%
1      314    35.3%

Sex_male
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.9 KiB

Value  Count
1      576
0      314

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 890
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      576    64.7%
0      314    35.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      576    64.7%
0      314    35.3%

Most occurring characters

Value  Count  Frequency (%)
1      576    64.7%
0      314    35.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  890    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      576    64.7%
0      314    35.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  890    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      576    64.7%
0      314    35.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  890    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      576    64.7%
0      314    35.3%

Interactions

[Figure: pairwise interaction plots, 25 panels]

Correlations

Variable     PassengerId  Age     SibSp   Parch   Fare    Survived  Pclass  Embarked_C  Embarked_Q  Embarked_S  Sex_female  Sex_male
PassengerId  1.000        0.049   -0.061  0.002   -0.015  0.105     0.029   0.000       0.000       0.000       0.064       0.064
Age          0.049        1.000   -0.167  -0.234  0.162   0.239     0.343   0.023       0.155       0.026       0.154       0.154
SibSp        -0.061       -0.167  1.000   0.450   0.448   0.188     0.148   0.111       0.074       0.083       0.205       0.205
Parch        0.002        -0.234  0.450   1.000   0.411   0.158     0.022   0.000       0.077       0.006       0.247       0.247
Fare         -0.015       0.162   0.448   0.411   1.000   0.284     0.481   0.273       0.092       0.168       0.188       0.188
Survived     0.105        0.239   0.188   0.158   0.284   1.000     0.335   0.163       0.000       0.144       0.542       0.542
Pclass       0.029        0.343   0.148   0.022   0.481   0.335     1.000   0.297       0.233       0.217       0.131       0.131
Embarked_C   0.000        0.023   0.111   0.000   0.273   0.163     0.297   1.000       0.139       0.779       0.072       0.072
Embarked_Q   0.000        0.155   0.074   0.077   0.092   0.000     0.233   0.139       1.000       0.494       0.061       0.061
Embarked_S   0.000        0.026   0.083   0.006   0.168   0.144     0.217   0.779       0.494       1.000       0.111       0.111
Sex_female   0.064        0.154   0.205   0.247   0.188   0.542     0.131   0.072       0.061       0.111       1.000       0.998
Sex_male     0.064        0.154   0.205   0.247   0.188   0.542     0.131   0.072       0.061       0.111       0.998       1.000
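
The 0.998 (rather than 1.000) between Sex_female and Sex_male is what a bias-corrected Cramér's V with Yates' continuity correction would give for two perfectly complementary indicators. The sketch below assumes that measure and assumes the two indicators are exact complements (576 male, 314 female); neither assumption is stated in the report. `corrected_cramers_v` is a hypothetical helper.

```python
import math

def corrected_cramers_v(table, yates=True):
    """Bias-corrected Cramér's V for an r x k contingency table.

    yates=True applies Yates' continuity correction, which is only
    meaningful for 2 x 2 tables. Hypothetical helper for illustration.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic, optionally with continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected

    # Bias correction of phi squared and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Rows: Sex_female = 0, 1; columns: Sex_male = 0, 1 (assumed complements).
table = [[0, 576], [314, 0]]
v = corrected_cramers_v(table)
print(f"{v:.3f}")
```

With `yates=False` the same table gives a value of 1, so under these assumptions the small gap comes from the continuity correction rather than from any disagreement between the two columns.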

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  PassengerId  Survived  Pclass  Age        SibSp  Parch  Fare      Embarked_C  Embarked_Q  Embarked_S  Sex_female  Sex_male
445    446          1         1       4.000000   0      2      81.8583   0           0           1           0           1
310    311          1         1       24.000000  0      0      83.1583   1           0           0           1           0
309    310          1         1       30.000000  0      0      56.9292   1           0           0           1           0
307    308          1         1       17.000000  1      0      108.9000  1           0           0           1           0
306    307          1         1       34.939024  0      0      110.8833  1           0           0           1           0
305    306          1         1       0.920000   1      2      151.5500  0           0           1           0           1
710    711          1         1       24.000000  0      0      49.5042   1           0           0           1           0
711    712          0         1       44.581967  0      0      26.5500   0           0           1           0           1
311    312          1         1       18.000000  2      2      262.3750  1           0           0           1           0
712    713          1         1       48.000000  1      0      52.0000   0           0           1           0           1

Index  PassengerId  Survived  Pclass  Age        SibSp  Parch  Fare      Embarked_C  Embarked_Q  Embarked_S  Sex_female  Sex_male
368    369          1         3       19.329787  0      0      7.7500    0           1           0           1           0
372    373          0         3       19.000000  0      0      8.0500    0           0           1           0           1
374    375          0         3       3.000000   3      1      21.0750   0           0           1           1           0
376    377          1         3       22.000000  0      0      7.2500    0           0           1           1           0
378    379          0         3       20.000000  0      0      4.0125    1           0           0           0           1
379    380          0         3       19.000000  0      0      7.7750    0           0           1           0           1
381    382          1         3       1.000000   0      2      15.7417   1           0           0           1           0
382    383          0         3       32.000000  0      0      7.9250    0           0           1           0           1
371    372          0         3       18.000000  1      0      6.4958    0           0           1           0           1
890    891          0         3       32.000000  0      0      7.7500    0           1           0           0           1