IPL 2008-2018 Data Analysis

Dataset Source: https://www.kaggle.com/nowke9/ipldata

There are two datasets, “matches” and “deliveries”, and I will be analyzing both.
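As a starting point, both files can be read into R with `read.csv`. This is a minimal sketch: the helper name `load_ipl` is hypothetical, and the file names `matches.csv` and `deliveries.csv` are assumptions about how the Kaggle download is saved locally.

```r
# Hypothetical helper: read both CSV files from one directory.
# The file names "matches.csv" and "deliveries.csv" are assumptions.
load_ipl = function(dir = ".") {
  list(
    matches    = read.csv(file.path(dir, "matches.csv"), stringsAsFactors = FALSE),
    deliveries = read.csv(file.path(dir, "deliveries.csv"), stringsAsFactors = FALSE)
  )
}

# Example usage (uncomment once the files are in the working directory):
# ipl        = load_ipl(".")
# matches    = ipl$matches
# deliveries = ipl$deliveries
```

Reading with `stringsAsFactors = FALSE` keeps player and team names as plain character strings, which makes later comparisons such as `!= ""` behave predictably.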

Cities hosting IPL Matches (Top 10)

[Figure: top 10 host cities]
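The counts behind a chart like this could be produced roughly as follows. This is a sketch on toy data, not the exact code used for the figure: it assumes the matches data has a `city` column, and the helper name `top_cities` and the placeholder city names are made up for illustration.

```r
# Toy stand-in for the matches data; the real analysis would use the Kaggle file.
matches = data.frame(
  city = c("City A", "City A", "City B", "City C", "City A", "City B", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n cities that hosted the most matches.
top_cities = function(m, n = 10) {
  city = m$city[!is.na(m$city) & m$city != ""]      # drop blank or missing cities
  counts = sort(table(city), decreasing = TRUE)     # matches per city, largest first
  out = as.data.frame(counts, stringsAsFactors = FALSE)
  colnames(out) = c("City", "Matches")
  head(out, n)
}

top_cities(matches, n = 2)
```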

Choice at the Toss (Batting or Fielding)

[Figure: choice at the toss]
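The toss split can be tabulated in a couple of lines. The sketch below uses toy data and assumes the matches data has a `toss_decision` column holding the values "bat" and "field".

```r
# Toy stand-in for the matches data (assumed column: toss_decision).
matches = data.frame(
  toss_decision = c("field", "bat", "field", "field", "bat"),
  stringsAsFactors = FALSE
)

toss_counts = table(matches$toss_decision)               # number of times each option was chosen
toss_share  = round(100 * prop.table(toss_counts), 1)    # the same split as percentages

toss_counts
toss_share
```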

Most number of Man of the Match Awards

[Figure: most Man of the Match awards]
This graph was made in MS Excel, but the data was cleaned and manipulated in R.
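A rough sketch of the R side of that workflow: count the awards per player, then export the result as a CSV that Excel can open. The column name `player_of_the_match` is an assumption about the matches data, the player names are placeholders, and the output file name is made up.

```r
# Toy stand-in for the matches data (assumed column: player_of_the_match).
matches = data.frame(
  player_of_the_match = c("Player A", "Player B", "Player A",
                          "Player C", "Player A", "Player C"),
  stringsAsFactors = FALSE
)

mom = data.frame(table(matches$player_of_the_match), stringsAsFactors = FALSE)
mom = mom[order(mom$Freq, decreasing = TRUE), ]   # most awards first
colnames(mom) = c("Player", "Awards")
head(mom, n = 10)

# Export for charting in Excel (hypothetical file name, written to a temp folder).
out_file = file.path(tempdir(), "man_of_the_match.csv")
write.csv(mom, out_file, row.names = FALSE)
```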

Team that has won most matches
[Figure: matches won by each team]

Keep in mind that Chennai Super Kings (CSK) and Rajasthan Royals (RR) were suspended from IPL 2016 and 2017. Deccan Chargers, Gujarat Lions, Pune Warriors, Rising Pune Supergiant, and Kochi Tuskers Kerala no longer play. Sunrisers Hyderabad played their inaugural match in 2013.
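Because teams have played different numbers of seasons, raw win totals favour the teams that have been around longest. One way to account for this is to report matches played and win percentage next to the wins. The sketch below uses toy data and assumes the matches data has `team1`, `team2`, and `winner` columns, with a blank `winner` meaning no result.

```r
# Toy stand-in for the matches data (assumed columns: team1, team2, winner).
matches = data.frame(
  team1  = c("Team A", "Team A", "Team B", "Team A", "Team B"),
  team2  = c("Team B", "Team C", "Team C", "Team B", "Team C"),
  winner = c("Team A", "Team A", "Team C", "Team B", ""),
  stringsAsFactors = FALSE
)

# Matches played: every appearance as team1 or team2.
played = table(c(matches$team1, matches$team2))

# Matches won: skip blank or missing winners, keep teams with zero wins.
decided = matches$winner[!is.na(matches$winner) & matches$winner != ""]
wins = table(factor(decided, levels = names(played)))

team_summary = data.frame(
  Team   = names(played),
  Played = as.vector(played),
  Won    = as.vector(wins),
  stringsAsFactors = FALSE
)
team_summary$WinPct = round(100 * team_summary$Won / team_summary$Played, 1)
team_summary = team_summary[order(team_summary$Won, decreasing = TRUE), ]
team_summary
```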

Batsman with most IPL runs

SN Batsman Total Runs (2008-2018)
1 SK Raina 5015
2 V Kohli 4962
3 RG Sharma 4504
4 G Gambhir 4223
5 RV Uthappa 4144
6 S Dhawan 4090
7 MS Dhoni 4041
8 CH Gayle 4037
9 DA Warner 4014
10 AB de Villiers 3974

R code for obtaining the above table:
# Total runs off the bat for each batsman
runs = aggregate(batsman_runs ~ batsman, data = deliveries, FUN = sum)
runs = runs[order(runs$batsman_runs, decreasing = TRUE), ]
colnames(runs) = c("Batsman", "Total Runs (2008-2018)")
head(runs, n = 10)

Bowlers with most number of Wickets

SN Bowler Number of Wickets
1 SL Malinga 170
2 A Mishra 155
3 DJ Bravo 155
4 PP Chawla 146
5 Harbhajan Singh 143
6 B Kumar 127
7 R Vinay Kumar 127
8 UT Yadav 127
9 SP Narine 126
10 A Nehra 121

R code for obtaining the above table:
# Keep only deliveries on which a batsman was dismissed (blank or NA means no wicket).
# Note: this counts every kind of dismissal, including run outs, against the bowler.
dismissed = subset(deliveries, !is.na(player_dismissed) & player_dismissed != "")
dismissed = data.frame(table(dismissed$bowler))
dismissed = dismissed[order(dismissed$Freq, decreasing = TRUE), ]
colnames(dismissed) = c("Bowler", "Number of Wickets")
head(dismissed, n = 10)
View(dismissed)

Batsman with most number of Sixes

SN Batsman No of Sixes
1 CH Gayle 293
2 AB de Villiers 188
3 MS Dhoni 186
4 SK Raina 186
5 RG Sharma 185

R code for obtaining the above table:
# Keep deliveries worth 6 runs off the bat (the original filter also admits 7).
six = subset(deliveries, batsman_runs == 6 | batsman_runs == 7)
six = aggregate(batsman_runs ~ batsman, data = six, FUN = sum)
six = six[order(six$batsman_runs, decreasing = TRUE), ]
six$no_six = round(six$batsman_runs / 6, digits = 0)  # runs from sixes divided by 6
six$batsman_runs = NULL
colnames(six) = c("Batsman", "No of Sixes")
head(six, n = 5)

Batsman with most number of Fours

SN Batsman No of Fours
1 G Gambhir 492
2 S Dhawan 465
3 SK Raina 449
4 V Kohli 436
5 RV Uthappa 412

R code for obtaining the above table:
# Keep deliveries worth 4 runs off the bat (the original filter also admits 5).
four = subset(deliveries, batsman_runs == 4 | batsman_runs == 5)
four = aggregate(batsman_runs ~ batsman, data = four, FUN = sum)
four = four[order(four$batsman_runs, decreasing = TRUE), ]
four$no_fours = round(four$batsman_runs / 4, digits = 0)  # runs from fours divided by 4
four$batsman_runs = NULL
colnames(four) = c("Batsman", "No of Fours")
head(four, n = 5)
View(four)


For queries and suggestions:

Email: sulovekoirala@gmail.com
