I must say I sometimes get irritated by the overuse of acronyms in today’s world, but this time I’ve created my own. TPOEM rather unimaginatively stands for The Power Of Eleven Model, which I have been developing over the past few weeks.
TPOEM is the culmination of fairly light research into simple OPTA-derived football statistics that I have been analysing over the past 6 months or so. Since I have only really put the information together over the past week or so, it is a bit foolhardy to discuss TPOEM in any detail right now – but I have already begun using it to rate player and team performance objectively, and even to test its efficacy at predicting match results.
I will give some detail on how the model works. The first point of note is that it is a bottom-up system: it analyses player data first and team data second. There are several reasons I wanted to approach the analysis in this way:
- A focus on player statistics gives an objective view of a player’s importance to a team, and can help indicate which players contributed most/least to a team’s performance
- Player statistics like goals scored and assists are readily available and easily compared between players at different clubs
- TPOEM can potentially capture information that is useful to understanding team playing styles
- TPOEM can potentially be used to give a prediction of a match result based on the team starting line-ups, which will give a clearer expectation of a result if key players from either team are missing
Although TPOEM is derived from fairly simple statistics, the most recent iteration incorporates 36 of them, ranging from goals scored and shots on target to tackles and ground duels. I have weighted the utility of each action and applied success rates where available to give a rating in simplified categories:
- Defending/Ball winning
- Passing/Ball retention
Of course, the overall scores are adjusted so that the most frequent actions (passing, touches, etc.) do not grossly outweigh the less frequent, but arguably more important, actions such as shots on target and goals scored. At the same time, I tried to take some care over the relevance of goals as a statistic – of course goals win games, but why should TPOEM rate attackers more highly than defenders because they score more often? Strikers often take all the plaudits for scoring goals, but since most goals are scored inside the box I have tried not to give undue credit for a goal scored – in many instances it is easier to score than to miss. I took a similar view of assists, seeking not to ramp up a player’s score too far simply because he completed a pass (however important it was). I have to stress that this still wasn’t quite a finger-in-the-air approach to rating – I have reviewed correlations to team performance at various layers with the aim of giving my weightings a scientific basis.
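To make the weighting idea concrete, here is a minimal sketch of one way such a rating could be put together. It is not TPOEM's actual formula: the `ActionStat` fields, the weights and the "typical attempts" figures are all hypothetical, chosen only to show how a utility weight, a success rate and a frequency adjustment might combine so that fifty passes do not swamp a handful of tackles.

```python
from dataclasses import dataclass


@dataclass
class ActionStat:
    """One action type for one player in one match (all fields hypothetical)."""
    attempts: int             # times the action was attempted
    successes: int            # times it succeeded
    weight: float             # assumed utility of the action
    typical_attempts: float   # assumed typical attempts per match, used to damp frequent actions


def action_score(stat: ActionStat) -> float:
    """Weight x success rate x volume relative to what is typical for that action."""
    if stat.attempts == 0 or stat.typical_attempts <= 0:
        return 0.0
    success_rate = stat.successes / stat.attempts
    relative_volume = stat.attempts / stat.typical_attempts
    return stat.weight * success_rate * relative_volume


def category_score(stats: list[ActionStat]) -> float:
    """Sum the action scores that belong to one category."""
    return sum(action_score(s) for s in stats)


# Illustrative numbers only, not TPOEM's real weights or data.
passing = [ActionStat(attempts=50, successes=40, weight=1.0, typical_attempts=50)]
defending = [
    ActionStat(attempts=4, successes=3, weight=2.0, typical_attempts=4),
    ActionStat(attempts=10, successes=6, weight=1.5, typical_attempts=10),
]
print(round(category_score(passing), 2), round(category_score(defending), 2))
```

Dividing by a typical volume is just one possible adjustment; any scaling that puts frequent and rare actions on a comparable footing would serve the same purpose.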
I have now tinkered with the algorithms enough times to realise that although TPOEM in one sense gives an objective rating of player performance, in another sense it remains a reflection of its creator’s biases and research. This is a limitation of any model, and it can only be reduced by testing and further research.
What about results? Well, I will keep publishing information over the coming weeks as I look to find suitable ways of presenting TPOEM’s output.
For now, I have run the model on the first 271 games of the Premier League season (i.e. before the kick-offs on 2 March), and I can announce the players it credits with the most man of the match performances so far this season:
This highlights the importance, according to TPOEM, of Santiago Cazorla to Arsenal’s season in terms of match-winning performances. Both Manchester sides and Arsenal lead the team man of the match awards with 22 apiece, the difference being that there is a much larger spread of players who have put in top performances for United and City in the league.
Those readers who follow me on Twitter will have noticed that TPOEM saw value in a home win for Everton and in draws for Swansea vs Newcastle and Manchester United vs Norwich. Please note that these weren’t direct match result predictions – TPOEM actually had all 3 as odds-on for home wins, but the probability of a draw, when compared with the bookmakers’ quoted odds before 3pm, seemed attractive at the time.
The main problem I had was in finding an efficient way to input all the line-ups in time for kick-off!
As it was, I completed my efforts and placed bets on all the 3pm kick-offs by 3.25pm – something I will have to work on going forward.
In addition to the above bets, of which only Everton’s home win against Reading paid off, I bet on a draw for Sunderland-Fulham (profit), an away win for West Ham (profit) and a win for QPR. Two of these bets were actually placed live, with the scores at 0-0, whilst QPR were already 1-0 up at Southampton when I took the gamble of backing them to win. According to TPOEM, Chelsea were massive favourites at home to West Brom, so I decided not to bother with a gamble on that game.
Most pleasing was West Ham’s away win at Stoke – a game which I am sure could just as easily have gone either way. When I ran the line-ups through TPOEM, West Ham had actually already made 2 early substitutions, so I incorporated those new players into the line-up. The model indicated about a 30% chance of West Ham winning, which was attractive enough when compared to quoted odds of about 9/4. Fortunately for the early prospects of TPOEM, they duly achieved an unlikely result at the Britannia.
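The comparison between a model probability and quoted odds can be sketched in a few lines. This is a generic value-betting calculation rather than anything specific to TPOEM; the function names are made up for illustration, and the 30% and 9/4 inputs are the approximate figures quoted above.

```python
from fractions import Fraction


def implied_probability(fractional_odds: Fraction) -> Fraction:
    """Break-even win probability for fractional odds a/b, i.e. b / (a + b)."""
    return 1 / (1 + fractional_odds)


def expected_profit(model_prob: float, fractional_odds: Fraction, stake: float = 1.0) -> float:
    """Average profit per bet if the model probability were exactly right."""
    winnings = stake * float(fractional_odds)
    return model_prob * winnings - (1 - model_prob) * stake


def is_value(model_prob: float, fractional_odds: Fraction) -> bool:
    """A bet offers value when the model's probability beats the break-even one."""
    return model_prob > implied_probability(fractional_odds)


odds = Fraction(9, 4)   # approximate quoted odds from the post
model_prob = 0.30       # approximate model probability from the post
print(implied_probability(odds), expected_profit(model_prob, odds), is_value(model_prob, odds))
```

At odds of 9/4 the break-even probability is 4/13, or roughly 30.8%, so a model estimate of about 30% sits right on the margin. Because both figures quoted are approximate, the size of the edge on this particular bet cannot be pinned down from them; a slightly longer price or a slightly higher probability would tip it into positive territory.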
I will continue to test TPOEM’s predictive efficacy vs bookmaker odds but for any followers of the blog, please note that I am seeking value not outright wins. Even if Manchester United are heavy favourites to win at home, as they were at the weekend, I may suggest another outcome if the odds are attractive enough depending on what my early-stage model tells me!