Dataset Source: https://www.kaggle.com/nowke9/ipldata
There are two datasets, “matches” and “deliveries”, and I will be analyzing both.
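A minimal sketch of loading the two tables in R. It assumes the Kaggle download has been saved locally as matches.csv and deliveries.csv; the helper name `load_ipl` and the folder argument are made up for illustration.

```r
# Hypothetical helper: read both CSV files from a local folder.
# The file names are assumed to match the Kaggle download.
load_ipl = function(dir = ".") {
  list(
    matches    = read.csv(file.path(dir, "matches.csv"), stringsAsFactors = FALSE),
    deliveries = read.csv(file.path(dir, "deliveries.csv"), stringsAsFactors = FALSE)
  )
}

# Usage: only runs when both files are present in the working directory.
if (file.exists("matches.csv") && file.exists("deliveries.csv")) {
  ipl = load_ipl(".")
  matches = ipl$matches
  deliveries = ipl$deliveries
  str(matches)  # quick look at the columns and their types
}
```

Reading text columns as plain character vectors (rather than factors) keeps later comparisons such as `player_dismissed != ""` straightforward.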
Cities hosting IPL Matches (Top 10)
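The counts behind this chart could be tallied along the following lines. This is a sketch on a toy data frame, not the code used for the original graph; it assumes the matches table has a `city` column, and `top_cities` is a hypothetical helper name.

```r
# Toy stand-in for the "matches" table (one row per match).
matches_demo = data.frame(
  city = c("Mumbai", "Mumbai", "Kolkata", "Delhi", "Mumbai", "Kolkata", ""),
  stringsAsFactors = FALSE
)

# Count matches per city, drop blank city names, sort, and keep the top n.
top_cities = function(matches, n = 10) {
  city = as.character(matches$city)
  city = city[!is.na(city) & city != ""]
  counts = sort(table(city), decreasing = TRUE)
  out = data.frame(City = names(counts), Matches = as.vector(counts),
                   stringsAsFactors = FALSE)
  head(out, n)
}

top_cities(matches_demo, n = 2)
```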

Choice at the Toss (Batting or Fielding)
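A rough sketch of how the split between batting and fielding first could be computed. It assumes the matches table has a `toss_decision` column with the values "bat" and "field"; the data below is a toy example.

```r
# Toy stand-in for the toss decisions recorded in "matches".
toss_demo = data.frame(
  toss_decision = c("field", "bat", "field", "field", "bat"),
  stringsAsFactors = FALSE
)

# Count each decision, then convert the counts to percentages.
toss_counts = table(toss_demo$toss_decision)
toss_share = round(100 * prop.table(toss_counts), 1)
toss_share
```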

Most number of Man of the Match Awards
This graph was made in MS Excel, but the data was cleaned and manipulated in R.
Team that has won most matches

We should not forget that Chennai Super Kings (CSK) and Rajasthan Royals (RR) were suspended from IPL 2016 and 2017. Deccan Chargers, Gujarat Lions, Pune Warriors, Rising Pune Supergiant and Kochi Tuskers Kerala no longer play. Sunrisers Hyderabad played their inaugural match in 2013.
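Because the teams have played different numbers of seasons, raw win totals favour the long-running franchises. One way to put them on a common footing is to divide wins by matches played. The sketch below does this on toy data; it assumes the matches table has `team1`, `team2` and `winner` columns, and `win_rate` is a hypothetical helper name.

```r
# Toy stand-in for "matches": three teams, four results.
results_demo = data.frame(
  team1  = c("A", "A", "B", "C"),
  team2  = c("B", "C", "C", "A"),
  winner = c("A", "C", "B", "A"),
  stringsAsFactors = FALSE
)

win_rate = function(matches) {
  # Every match counts once for each of the two sides.
  played = table(c(as.character(matches$team1), as.character(matches$team2)))
  # Align wins to the same team order; blank winners (no result) are dropped.
  won = table(factor(as.character(matches$winner), levels = names(played)))
  out = data.frame(Team = names(played), Played = as.vector(played),
                   Won = as.vector(won), stringsAsFactors = FALSE)
  out$WinPct = round(100 * out$Won / out$Played, 1)
  out[order(out$WinPct, decreasing = TRUE), ]
}

win_rate(results_demo)
```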
Batsman with most IPL runs
| SN | Batsman | Total Runs (2008-2018) |
| 1 | SK Raina | 5015 |
| 2 | V Kohli | 4962 |
| 3 | RG Sharma | 4504 |
| 4 | G Gambhir | 4223 |
| 5 | RV Uthappa | 4144 |
| 6 | S Dhawan | 4090 |
| 7 | MS Dhoni | 4041 |
| 8 | CH Gayle | 4037 |
| 9 | DA Warner | 4014 |
| 10 | AB de Villiers | 3974 |
R code for obtaining the above table:
# Sum the runs scored off the bat for each batsman
runs = aggregate(batsman_runs ~ batsman, data = deliveries, FUN = sum)
# Sort from the highest total to the lowest
runs = runs[order(runs$batsman_runs, decreasing = TRUE), ]
colnames(runs) = c("Batsman", "Total Runs (2008-2018)")
head(runs, n = 10)
Bowlers with most number of Wickets
| SN | Bowler | Number of Wickets |
| 1 | SL Malinga | 170 |
| 2 | A Mishra | 155 |
| 3 | DJ Bravo | 155 |
| 4 | PP Chawla | 146 |
| 5 | Harbhajan Singh | 143 |
| 6 | B Kumar | 127 |
| 7 | R Vinay Kumar | 127 |
| 8 | UT Yadav | 127 |
| 9 | SP Narine | 126 |
| 10 | A Nehra | 121 |
R code for obtaining the above table:
# Keep the bowler and the dismissed player for every delivery
dismissed = data.frame(Bowler = deliveries$bowler,
                       Dismissed = as.character(deliveries$player_dismissed))
# A wicket fell only where a dismissed player is recorded
# (this counts every dismissal on the bowler's delivery, run-outs included)
dismissed = subset(dismissed, !is.na(Dismissed) & Dismissed != "")
dismissed = data.frame(table(as.character(dismissed$Bowler)))
dismissed = dismissed[order(dismissed$Freq, decreasing = TRUE), ]
colnames(dismissed) = c("Bowler", "Number of Wickets")
head(dismissed, n = 10)
View(dismissed)
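The code above counts every dismissal that happened on a bowler's delivery, so run-outs are included. A possible refinement, sketched here on toy data, is to skip dismissals that are conventionally not credited to the bowler. It assumes the deliveries table has a `dismissal_kind` column with labels such as "run out"; the exact labels should be checked against the data, `bowler_wickets` is a hypothetical helper, and the totals it gives may differ from the table above.

```r
# Toy stand-in for "deliveries".
deliveries_demo = data.frame(
  bowler           = c("X", "X", "Y", "Y", "X"),
  player_dismissed = c("P1", "", "P2", "P3", "P4"),
  dismissal_kind   = c("caught", "", "run out", "bowled", "lbw"),
  stringsAsFactors = FALSE
)

# Dismissal kinds assumed not to count towards the bowler's tally.
not_bowler = c("run out", "retired hurt", "obstructing the field")

bowler_wickets = function(deliveries) {
  dismissed = as.character(deliveries$player_dismissed)
  is_wicket = !is.na(dismissed) & dismissed != "" &
    !(deliveries$dismissal_kind %in% not_bowler)
  counts = sort(table(as.character(deliveries$bowler[is_wicket])),
                decreasing = TRUE)
  data.frame(Bowler = names(counts), Wickets = as.vector(counts),
             stringsAsFactors = FALSE)
}

bowler_wickets(deliveries_demo)
```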
Batsman with most number of Sixes
| SN | Batsman | No of Sixes |
| 1 | CH Gayle | 293 |
| 2 | AB de Villiers | 188 |
| 3 | MS Dhoni | 186 |
| 4 | SK Raina | 186 |
| 5 | RG Sharma | 185 |
R code for obtaining the above table:
# Keep deliveries where the batsman was credited with 6 (or 7) runs
six = subset(deliveries, batsman_runs == 6 | batsman_runs == 7)
# Total those runs per batsman and sort from highest to lowest
six = aggregate(batsman_runs ~ batsman, data = six, FUN = sum)
six = six[order(six$batsman_runs, decreasing = TRUE), ]
# Approximate the number of sixes as total runs / 6
six$no_six = round(six$batsman_runs / 6, digits = 0)
six$batsman_runs = NULL
colnames(six) = c("Batsman", "No of Sixes")
head(six, n = 5)
Batsman with most number of Fours
| SN | Batsman | No of Fours |
| 1 | G Gambhir | 492 |
| 2 | S Dhawan | 465 |
| 3 | SK Raina | 449 |
| 4 | V Kohli | 436 |
| 5 | RV Uthappa | 412 |
R code for obtaining the above table:
# Keep deliveries where the batsman was credited with 4 (or 5) runs
four = subset(deliveries, batsman_runs == 4 | batsman_runs == 5)
# Total those runs per batsman and sort from highest to lowest
four = aggregate(batsman_runs ~ batsman, data = four, FUN = sum)
four = four[order(four$batsman_runs, decreasing = TRUE), ]
# Approximate the number of fours as total runs / 4
four$no_fours = round(four$batsman_runs / 4, digits = 0)
four$batsman_runs = NULL
colnames(four) = c("Batsman", "No of Fours")
head(four, n = 5)
View(four)
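The snippets above estimate fours and sixes by summing runs and dividing by 4 or 6. An alternative sketch is to count the qualifying deliveries directly, which avoids the rounding step. The helper `count_boundaries` and the toy data are made up for illustration, and the result may differ slightly from the tables above because deliveries worth 5 or 7 runs are not included.

```r
# Toy stand-in for "deliveries".
balls_demo = data.frame(
  batsman      = c("A", "A", "B", "A", "B", "B"),
  batsman_runs = c(4, 6, 4, 4, 1, 6),
  stringsAsFactors = FALSE
)

# Count deliveries on which a batsman scored exactly `runs` (4 or 6).
count_boundaries = function(deliveries, runs, n = 5) {
  hits = as.character(deliveries$batsman[deliveries$batsman_runs == runs])
  counts = sort(table(hits), decreasing = TRUE)
  out = data.frame(Batsman = names(counts), Count = as.vector(counts),
                   stringsAsFactors = FALSE)
  head(out, n)
}

count_boundaries(balls_demo, runs = 4)  # fours per batsman
count_boundaries(balls_demo, runs = 6)  # sixes per batsman
```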
For queries and suggestions:
Email: sulovekoirala@gmail.com
