We would like to understand how people rate themselves across five attributes (attractiveness, sincerity, intelligence, fun and ambition) and how this compares to the ratings they get from their dating partners.
We used the data from a speed dating experiment conducted at Columbia Business School, available on Kaggle. Refer to the documentation for more details about the attributes used, or jump to the segmentation section for a short description.
The first 5 of the 5656 rows look as follows (the table is transposed, so each numbered column is one row of the data):
Variable | 01 | 02 | 03 | 04 | 05 |
---|---|---|---|---|---|
attr_o | 6 | 7 | 10 | 7 | 8 |
sinc_o | 8 | 8 | 10 | 8 | 7 |
intel_o | 8 | 10 | 10 | 9 | 9 |
fun_o | 8 | 7 | 10 | 8 | 6 |
amb_o | 8 | 7 | 10 | 9 | 9 |
shar_o | 6 | 5 | 10 | 8 | 7 |
field_cd | 1 | 1 | 1 | 1 | 1 |
race | 4 | 4 | 4 | 4 | 4 |
income | 165 | 165 | 165 | 165 | 165 |
goal | 2 | 2 | 2 | 2 | 2 |
date | 7 | 7 | 7 | 7 | 7 |
go_out | 1 | 1 | 1 | 1 | 1 |
career | 220 | 220 | 220 | 220 | 220 |
career_c | | | | | |
sports | 9 | 9 | 9 | 9 | 9 |
tvsports | 2 | 2 | 2 | 2 | 2 |
exercise | 8 | 8 | 8 | 8 | 8 |
dining | 9 | 9 | 9 | 9 | 9 |
museums | 1 | 1 | 1 | 1 | 1 |
art | 1 | 1 | 1 | 1 | 1 |
hiking | 5 | 5 | 5 | 5 | 5 |
gaming | 1 | 1 | 1 | 1 | 1 |
clubbing | 5 | 5 | 5 | 5 | 5 |
reading | 6 | 6 | 6 | 6 | 6 |
tv | 9 | 9 | 9 | 9 | 9 |
theater | 1 | 1 | 1 | 1 | 1 |
movies | 10 | 10 | 10 | 10 | 10 |
concerts | 10 | 10 | 10 | 10 | 10 |
music | 9 | 9 | 9 | 9 | 9 |
shopping | 8 | 8 | 8 | 8 | 8 |
yoga | 1 | 1 | 1 | 1 | 1 |
The high-level process for deriving insights from the speed dating data is split into three parts:
Part 1: We use descriptive statistics (e.g. mean, standard deviation) to understand whether men and women tend to overvalue or undervalue themselves across the five studied attributes
Part 2: We run hypothesis tests to determine, at the 1% significance level (99% confidence), whether men and women overvalue themselves
Part 3: We plot the self-ratings of men and women against the ratings given by their dating partners and look for patterns of over- or undervaluation across the two sexes
For each participant, we computed the difference between the self-rating and the rating given by the dating partners, both for the total rating and for each of the five attributes. We then computed the mean and standard deviation of that difference for the whole population and for men and women separately. The fraction of people who overvalue or undervalue themselves can provide meaningful insights about self-perception and dating behavior.
Group | Overall_Avg_delta | Overall_% | Avg_Attr_Delta | Attr_% | Avg_Sinc_Delta | Sinc_% | Avg_Intel_Delta | Intel_% | Avg_Fun_Delta | Fun_% | Avg_Amb_Delta | Amb_% |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Overall | -0.8 | | -0.67 | | -0.64 | | -0.87 | | -0.91 | | -0.91 | |
Men | -0.77 | | -0.75 | | -0.51 | | -0.92 | | -0.89 | | -0.77 | |
Women | -0.83 | | -0.61 | | -0.76 | | -0.83 | | -0.93 | | -1.03 | |
SD Men | 1.68 | | 2.19 | | 2.42 | | 2.03 | | 2.31 | | 2.4 | |
SD Women | 1.75 | | 2.29 | | 2.34 | | 2.01 | | 2.46 | | 2.44 | |
Undervalued (Overall) | 2210 | 65.73% | 1741 | 51.78% | 1745 | 51.90% | 1914 | 56.93% | 1892 | 56.28% | 1924 | 57.23% |
Overvalued (Overall) | 1008 | 29.98% | 1019 | 30.31% | 1048 | 31.17% | 800 | 23.80% | 916 | 27.25% | 922 | 27.42% |
On spot (Overall) | 144 | 4.28% | 602 | 17.91% | 569 | 16.92% | 648 | 19.27% | 554 | 16.48% | 516 | 15.35% |
Undervalued (men) | 1037 | 65.10% | 836 | 52.48% | 784 | 49.22% | 931 | 58.44% | 891 | 55.93% | 868 | 54.49% |
Overvalued (men) | 480 | 30.13% | 453 | 28.44% | 536 | 33.65% | 355 | 22.28% | 431 | 27.06% | 465 | 29.19% |
On spot (men) | 76 | 4.77% | 304 | 19.08% | 273 | 17.14% | 307 | 19.27% | 271 | 17.01% | 260 | 16.32% |
Undervalued (women) | 1173 | 66.31% | 905 | 51.16% | 961 | 54.32% | 983 | 55.57% | 1001 | 56.59% | 1056 | 59.69% |
Overvalued (women) | 528 | 29.85% | 566 | 32.00% | 512 | 28.94% | 445 | 25.16% | 485 | 27.42% | 457 | 25.83% |
On spot (women) | 68 | 3.84% | 298 | 16.85% | 296 | 16.73% | 341 | 19.28% | 283 | 16.00% | 256 | 14.47% |
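The per-participant difference and its summary statistics can be sketched roughly as follows. This is a minimal illustration on toy data, not the code used for the table above: the record keys `gender`, `self_rating` and `partner_rating` are hypothetical, and the sign convention (partner rating minus self-rating) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_deltas(records):
    """Summarize rating gaps per gender.

    Each record is a dict with the hypothetical keys 'gender',
    'self_rating' and 'partner_rating'. The gap is computed as
    partner_rating - self_rating (assumed sign convention).
    """
    gaps = defaultdict(list)
    for record in records:
        gaps[record["gender"]].append(
            record["partner_rating"] - record["self_rating"]
        )

    summary = {}
    for gender, values in gaps.items():
        n = len(values)
        summary[gender] = {
            "n": n,
            "mean": mean(values),
            # Sample standard deviation needs at least two values.
            "sd": stdev(values) if n > 1 else 0.0,
            # Counts of participants rated below, above, or exactly at
            # their own self-rating by their partners.
            "partner_below_self": sum(v < 0 for v in values),
            "partner_above_self": sum(v > 0 for v in values),
            "partner_equals_self": sum(v == 0 for v in values),
        }
    return summary


# Toy data for illustration only; these are not values from the dataset.
toy = [
    {"gender": "M", "self_rating": 8, "partner_rating": 6},
    {"gender": "M", "self_rating": 7, "partner_rating": 7},
    {"gender": "M", "self_rating": 5, "partner_rating": 6},
    {"gender": "F", "self_rating": 9, "partner_rating": 7},
    {"gender": "F", "self_rating": 6, "partner_rating": 8},
]

for gender, stats in summarize_deltas(toy).items():
    print(gender, stats)
```

Dividing each count by `n` gives the percentage columns; the same function can be applied once per attribute.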
Next, we estimate the average score that men and women receive from their dating partners and the importance of each attribute for men's and women's matching decisions.
Measure | Attractiveness | Sincerity | Intelligence | Fun | Ambition |
---|---|---|---|---|---|
Average Men's Score | 5.99 | 7.12 | 7.46 | 6.32 | 6.97 |
Average Women's Score | 6.49 | 7.25 | 7.27 | 6.51 | 6.63 |
Importance for Men | 28.48% | 16.30% | 19.12% | 17.78% | 8.18% |
Importance for Women | 18.69% | 18.49% | 20.78% | 17.13% | 12.06% |
Based on the above statistics, we can derive the following insights:
To validate the significance of the above insights, we run a one-sided hypothesis test.
Hypotheses:
H0: mu-delta >= 0
HA: mu-delta < 0 (meaning that self-rating is higher than partner’s rating, i.e. overrating)
Statistic | Attractiveness | Sincerity | Intelligence | Fun | Ambition |
---|---|---|---|---|---|
Mean (men) | -0.75 | -0.51 | -0.92 | -0.89 | -0.77 |
SD (men) | 2.19 | 2.42 | 2.03 | 2.31 | 2.4 |
Sample size (men) | 1593 | 1593 | 1593 | 1593 | 1593 |
t-value (men) | -20.219 | -13.091 | -25.901 | -23.406 | -19.849 |
Critical Value at 1% (men) | -2.329 | -2.329 | -2.329 | -2.329 | -2.329 |
Statistical Decision (men) | Reject | Reject | Reject | Reject | Reject |
Mean (women) | -0.61 | -0.76 | -0.83 | -0.93 | -1.03 |
SD (women) | 2.29 | 2.34 | 2.01 | 2.46 | 2.44 |
Sample size (women) | 1769 | 1769 | 1769 | 1769 | 1769 |
t-value (women) | -16.902 | -20.811 | -24.457 | -24.887 | -27.846 |
Critical Value at 1% (women) | -2.328 | -2.328 | -2.328 | -2.328 | -2.328 |
Statistical Decision (women) | Reject | Reject | Reject | Reject | Reject |
On the basis of these results, we conclude that men and women tend to overrate themselves across all five attributes.
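A one-sided test of this kind can be sketched from summary statistics alone. The snippet below is a minimal illustration, assuming the usual one-sample statistic (mean divided by its standard error) and a normal approximation for the critical value, which is close to the t critical value for samples this large (about -2.326 versus -2.328). The function name and the example inputs are hypothetical, and plugging the rounded means and standard deviations from the table into it will not necessarily reproduce the reported t-values.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_test(mean_delta, sd_delta, n, alpha=0.01):
    """Test H0: mu_delta >= 0 against HA: mu_delta < 0.

    Returns the test statistic, the critical value and the decision.
    The critical value uses a normal approximation (an assumption that
    is reasonable only for large n).
    """
    standard_error = sd_delta / sqrt(n)
    t_value = mean_delta / standard_error
    critical_value = NormalDist().inv_cdf(alpha)  # lower-tail quantile
    reject_h0 = t_value < critical_value
    return t_value, critical_value, reject_h0


# Hypothetical inputs chosen for easy arithmetic, not taken from the data.
t_value, critical_value, reject_h0 = one_sided_test(-0.5, 2.0, 1600)
print(f"t = {t_value:.3f}, critical = {critical_value:.3f}, reject H0: {reject_h0}")
```

H0 is rejected when the statistic falls below the critical value, that is, when the mean difference is far enough below zero relative to its standard error.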