Problem Statement
It is the summer of 2018. The head scout of your club (one of the biggest in the Premier League) wants to scout for a particular position from anywhere in the top 5 leagues using Wyscout data.
Forward, Chelsea
He is interested in hearing about a variety of methods for scouting players, so he asks you to try out one of the methods from the course or even one of your own. You should primarily use the Wyscout data but can also use Transfermarkt to find the prices of players in summer 2018.
You should do the following steps:
1. Implement one of the methods (plus/minus, percentiles, Markov chain, possession chains, or one of your own) on the data. Decide on a suitable metric for ranking players and make a top-10 list of players in your position for the whole league. Write a simple non-technical text (half a page) explaining your metric to the scout and what assumptions it makes. Provide a separate runnable piece of code (preferably in Python) which, when put in the directory of the Wyscout data, creates the top 10. This code must be a single file and use only standard libraries (mplsoccer is fine).
2. Use your metric to find a single player in another league (not the Premier League) who you would recommend signing. Add additional statistics and visualizations to explain the strengths and weaknesses of that player (these can use World Cup data where appropriate). Create a two-page report, in a poster style, with as many visualizations as you want but a maximum of 2 pages on that player.
The metric of choice
I used the G-xG (goals minus expected goals) metric to rank forwards in the Premier League and to see how our forwards compare to other strikers in the league. xG is a good indicator of a forward's ability to find the right positions to receive a pass, which gives him a higher chance of scoring. However, xG on its own is not a good indicator of a player's form or of how well he finishes his chances. If we subtract a player's total xG from the goals he actually scored, we get a metric that reflects both his ability to create good shooting opportunities and his ability to convert them. The higher the G-xG, the better the player. For example, a player who scores 5 goals from 6 chances worth a total of 4.5 xG has a positive G-xG of 0.5. G-xG can also be negative, which is the case for two of our three forwards.
The metric rests on two assumptions. First, I removed all shots that were not from open play, as well as all headers, so only open-play shots and goals taken with either foot are included. Second, the xG model uses only the distance from the goal and the angle of the shot to calculate xG.
Last season (2017/2018), Mohamed Salah topped the G-xG list with an outstanding 12.2, far ahead of every other forward in the league. He was followed by Aubameyang and Firmino with 4.9 and 4.7 respectively. Our forwards Eden Hazard, Alvaro Morata, and Olivier Giroud ranked 5th, 75th, and 61st. Morata had the second-worst G-xG in the whole league at -3.2, scoring only 4 goals from open-play chances worth 7.2 xG, behind only Christian Benteke, who failed to score any goals from an xG of 5. Hazard's G-xG of 3.9 is higher than that of other forwards such as Jamie Vardy (3.2), Raheem Sterling (3.1), and Sadio Mane (2.2). Giroud had a G-xG of -1.0, which looks bad on paper, but considering that he did not play as many games as Morata, it is an acceptable score.
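The G-xG calculation described above can be sketched in a few lines of Python. This is only an illustration, not the code submitted for the assignment: it assumes each shot has already been filtered (open play, taken with a foot) and already carries an xG value from some model, and the function names, tuple layout, and toy players are all made up for the example. Player A reproduces the worked example of 5 goals from 6 chances worth 4.5 xG.

```python
from collections import defaultdict


def g_minus_xg(shots):
    """Return {player: goals - total xG}.

    `shots` is an iterable of (player, scored, xg) tuples. This layout is
    an assumption for the sketch, not the Wyscout event format.
    """
    goals = defaultdict(int)
    xg_total = defaultdict(float)
    for player, scored, xg in shots:
        goals[player] += int(scored)   # count a goal only when the shot scored
        xg_total[player] += xg         # every shot adds its xG
    return {player: goals[player] - xg_total[player] for player in xg_total}


def rank_players(shots, top_n=10):
    """Sort players by G-xG, best first, and keep the top `top_n`."""
    scores = g_minus_xg(shots)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Toy data (hypothetical players and xG values).
toy_shots = [
    # Player A: 5 goals from 6 chances worth 4.5 xG in total
    ("Player A", True, 0.9),
    ("Player A", True, 0.8),
    ("Player A", True, 0.8),
    ("Player A", True, 0.7),
    ("Player A", True, 0.7),
    ("Player A", False, 0.6),
    # Player B: 1 goal from 3 chances worth 1.5 xG in total
    ("Player B", True, 0.25),
    ("Player B", False, 0.5),
    ("Player B", False, 0.75),
]

ranking = rank_players(toy_shots)
for player, score in ranking:
    print(f"{player}: {score:+.1f}")
```

Because G-xG is a season total rather than a per-game rate, a player with few shots will tend to sit closer to zero than a regular starter, which is worth keeping in mind when comparing players with very different playing time.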