Model pitfalls and further discussion of TPOEM

Since my previous post introducing a new model for football analysis, TPOEM, I have developed and integrated some significant improvements to it.

Firstly, the speed with which I can produce predictions from a team’s starting line-up is much better (less manual input, more automation), so last Saturday I was able to tweet the model’s predictions well before the 3pm kick-offs began.

Secondly I have added a manager/leadership factor to the analysis, which is dynamic and unique to each team. This adjustment is intended to ‘smooth’ the team-level aggregate scores that TPOEM calculates in cases where the model would not otherwise capture a persistent difference between a team’s results and its underlying scores. This offsets (albeit not completely) the difference between the model’s league table and the actual league table. Why does that difference arise? Well, the basic underlying reason is the same as why a shots-on-goal league table does not reflect the real league table. I attribute this to a kind of quality factor that I am not picking up in the statistics I use: quality in terms of shooting can relate to the position on the pitch of a shot, whether defenders pressured the attacker and how much the assist contributed to a goal scored. This quality factor will also incorporate a team’s record at home or away. For reference, the model currently seems to think that Stoke and Norwich are doing better in the league than their underlying scores suggest, whilst Wigan, Southampton and QPR are all doing worse in the league than TPOEM suggests they should be. That might be due to luck, team playing style, management, player leadership, quality or all of the above. The model should now be slightly better at accounting for that.

Predicting part 2

So the first week of predicting using TPOEM brought me a net profit, although my biggest win was West Ham’s away win vs Stoke – and I’ve already explained that the model was distinctly anti-Stoke before the most recent update!

As ever, I am seeking value: even if TPOEM suggests a probability of only about 40% for an outcome (win/draw/loss), if the bookmakers’ odds imply a probability of 35% then I consider it an attractive bet. As it stands I haven’t been that selective about what I bet on: in fact so far I’ve been betting on every match that I ran the model for, even though in many cases the model didn’t really suggest any particular value vs the bookies.
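The value check described above can be sketched in a few lines. This is only an illustration, assuming decimal odds and assuming the bookmaker's margin (the overround) is not stripped out of the implied probability; the function names and the `min_edge` threshold are hypothetical and are not part of TPOEM.

```python
def implied_probability(decimal_odds: float) -> float:
    """Convert decimal odds (e.g. 2.50) into the probability they imply."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def value_edge(model_probability: float, decimal_odds: float) -> float:
    """Model probability minus the bookmaker's implied probability."""
    return model_probability - implied_probability(decimal_odds)


def is_value_bet(model_probability: float, decimal_odds: float,
                 min_edge: float = 0.0) -> bool:
    """True when the model rates the outcome more likely than the odds imply.

    min_edge is a hypothetical threshold for being more selective.
    """
    return value_edge(model_probability, decimal_odds) > min_edge


# The 40% vs 35% case from the text: odds implying 35% are 1 / 0.35.
odds_implying_35_percent = 1.0 / 0.35
print(is_value_bet(0.40, odds_implying_35_percent))
```

Raising `min_edge` above zero would be one way to stop betting on matches where the model shows no particular value.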

The result this week, from 5 games, was another net profit, this time +26% return (it was +56% last time). But that came from 2 wins, 1 void, 2 lost bets, so in a sense the net result was neutral.  I profited overall because I weighted my bets towards the most attractive in terms of value – the biggest win being a draw-no-bet backing Everton at home to Man City. The model really liked Everton’s chances mostly because Kompany, Aguero and Yaya Touré were all missing for Man City.
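To see how a neutral win/loss record can still produce a profit, here is a small sketch with entirely hypothetical stakes and odds (they are not the bets actually placed). It assumes a void draw-no-bet simply returns the stake, and that voided stakes still count towards the total staked when working out the percentage return.

```python
def bet_profit(stake: float, decimal_odds: float, outcome: str) -> float:
    """Profit on one bet; a void bet (e.g. a drawn draw-no-bet) returns the stake."""
    if outcome == "won":
        return stake * (decimal_odds - 1.0)
    if outcome == "void":
        return 0.0
    if outcome == "lost":
        return -stake
    raise ValueError(f"unknown outcome: {outcome!r}")


def net_return(bets) -> float:
    """Total profit divided by total staked, for (stake, odds, outcome) tuples."""
    total_staked = sum(stake for stake, _, _ in bets)
    total_profit = sum(bet_profit(stake, odds, outcome)
                       for stake, odds, outcome in bets)
    return total_profit / total_staked


# Hypothetical figures: five even-money bets, 2 won, 1 void, 2 lost.
flat = [(10, 2.0, "won"), (10, 2.0, "won"), (10, 2.0, "void"),
        (10, 2.0, "lost"), (10, 2.0, "lost")]
# Same outcomes, but with a larger stake on the bet judged to be best value.
weighted = [(30, 2.0, "won"), (10, 2.0, "won"), (10, 2.0, "void"),
            (10, 2.0, "lost"), (10, 2.0, "lost")]

print(net_return(flat), net_return(weighted))
```

With equal stakes at even money the two wins and two losses cancel out; the profit in the weighted case comes purely from staking more on the winning bet.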

I also backed draw-no-bets for Liverpool, Villa and Stoke: lost, won, void respectively. And lastly I went with a draw for Swansea-Arsenal (lost) but in retrospect I shouldn’t have bothered with that bet because the model gave no conclusive direction for the game and the odds weren’t good either.

As I reformat the model’s data and find a better way of communicating its predictions and results, I will publish more information on the blog; I recognise I have kept most of the details pretty close to my chest so far. When I’m at my desk for the 3pm kick-offs I will also tweet the model’s predictions, so look out for that if you’re interested. But if you bet, you do so at your own risk!


Premier League 2011-12: Position Analysis CB

By now, some of the features of my analyses are starting to become consistent. As before, I have filtered out any players with fewer than 1000mins playing time in the position of centre back. Centre back here means position ids 5 or 6 (always centre back regardless of whether the team plays a back 3, 4 or 5). I also added players who played position 4 in a defensive 3 or 5. The most notable absence from this list is Nemanja Vidic, who only managed 502mins playing time last season.
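The filter described above could be expressed roughly as follows. The data layout is an assumption made for illustration: each player's season is represented as a list of hypothetical (position id, defenders in the back line, minutes) spells, which is not necessarily how the underlying data is stored.

```python
CENTRE_BACK_IDS = {5, 6}   # always centre back, whatever the formation
MIN_MINUTES = 1000         # players with fewer minutes are filtered out


def is_centre_back_spell(position_id: int, defenders_in_back_line: int) -> bool:
    """Position 5 or 6 always counts; position 4 counts only in a back 3 or 5."""
    if position_id in CENTRE_BACK_IDS:
        return True
    return position_id == 4 and defenders_in_back_line in (3, 5)


def centre_back_minutes(spells) -> int:
    """Sum minutes over (position_id, defenders_in_back_line, minutes) spells."""
    return sum(minutes for position_id, back_line, minutes in spells
               if is_centre_back_spell(position_id, back_line))


def qualifies(spells) -> bool:
    """True when the player has at least MIN_MINUTES at centre back."""
    return centre_back_minutes(spells) >= MIN_MINUTES


# Hypothetical player: position 4 in a back four does not count towards the total.
example_spells = [(5, 4, 600), (4, 3, 450), (4, 4, 300)]
print(centre_back_minutes(example_spells), qualifies(example_spells))
```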

Next I reviewed the strong correlations between each statistic for this shortened list of players and wins/draws/losses, goals for and goals conceded. I sorted the list by the strongest negative and positive correlations to try to ascertain which attributes of central defenders contribute most to winning games. As ever, several passing fields showed up (all of which also have strong cross-correlations), so I have been selective in which passing fields I kept and which I removed, in order to reduce the bias towards teams that either pass much more than average or whose central defenders have more work to do. Wherever a field allowed a success rate ratio (e.g. tackles won vs tackles lost) I used one; otherwise I generally used a rate per minute measure.
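A rough sketch of this correlation step is given below. It assumes Pearson correlation (the post does not say which measure was used), and the field names, the `rank_fields` helper and the sample numbers are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant field carries no information
    return cov / sqrt(var_x * var_y)


def success_rate(won: float, lost: float) -> float:
    """Success ratio for fields with a won/lost split, e.g. tackles."""
    total = won + lost
    return won / total if total else 0.0


def per_minute(count: float, minutes: float) -> float:
    """Rate per minute for fields without a won/lost split."""
    return count / minutes


def rank_fields(fields: dict, target) -> list:
    """Sort fields from most positive to most negative correlation with target."""
    scores = {name: pearson(values, target) for name, values in fields.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-player values for three players, against wins per game.
fields = {
    "tackle_success": [0.60, 0.70, 0.80],
    "unsuccessful_touches_per_min": [0.03, 0.02, 0.01],
}
wins_per_game = [0.3, 0.5, 0.7]
ranking = rank_fields(fields, wins_per_game)
print(ranking)
```

Fields at either end of the ranking are the candidates of interest; running `pearson` between two candidate fields gives the cross-correlation check mentioned above.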

Much like my previous system for full backs, I split the players into sectors and gave them points (1-6, worst to best) depending on how good they were relative to their peers. Then I added/deducted bonus points. In a similar vein to the full backs analysis, the bonus points cover goals scored, assists, errors leading to goals, penalties conceded and red cards.
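The points scheme can be sketched like this. It assumes the sectors are six roughly equal-sized groups by rank, which is a guess at how the split works, and the bonus weights are hypothetical placeholders rather than the values actually used.

```python
# Hypothetical bonus weights: the post names the categories, not the values.
BONUS_WEIGHTS = {
    "goals": 1,
    "assists": 1,
    "errors_leading_to_goals": -1,
    "penalties_conceded": -1,
    "red_cards": -1,
}


def sector_points(values: dict, higher_is_better: bool = True,
                  sectors: int = 6) -> dict:
    """Rank players on one field and award 1 (worst) to `sectors` (best) points.

    Ties are broken by the input order, which a real system may want to refine.
    """
    worst_first = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(worst_first)
    return {name: (index * sectors) // n + 1
            for index, name in enumerate(worst_first)}


def total_score(name: str, points_by_field: list, bonus_events: dict) -> int:
    """Sum a player's sector points across fields, then add or deduct bonuses."""
    base = sum(points[name] for points in points_by_field)
    bonus = sum(BONUS_WEIGHTS[event] * count
                for event, count in bonus_events.get(name, {}).items())
    return base + bonus


# Hypothetical players "a" to "f" on two fields.
tackle_success = {"a": 0.5, "b": 0.6, "c": 0.7, "d": 0.8, "e": 0.9, "f": 1.0}
touch_errors = {"a": 0.01, "b": 0.02, "c": 0.03, "d": 0.04, "e": 0.05, "f": 0.06}
points = [
    sector_points(tackle_success, higher_is_better=True),
    sector_points(touch_errors, higher_is_better=False),
]
print(total_score("a", points, {"a": {"goals": 2, "red_cards": 1}}))
```

With 54 players and six sectors, each sector would hold nine players under this equal-split assumption.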

Compared to full backs, heading statistics are much more important for centre backs. Headed clearances and aerial duels both had relatively significant correlations to winning games, and heading matters in attack as well as defence: about 2/3 of the goals scored by central defenders (40 out of 62) came from headers. As a result I used both headed clearances and aerial duels in my model, even though they are probably closely correlated. Similarly, ground duels, tackle success and challenges lost may also all be closely linked, but I considered this such an important part of a central defender’s role that I included all three – thereby artificially increasing the weighting to those fields.

The final scores are below for all 54 central defenders analysed by this system, with something of a surprise at the top!

Yes, according to my system, Clint Hill was the best centre back last season. Having spent the first part of the season in the Championship (on loan at Nottingham Forest) he came back into favour during Mark Hughes’ reign but still only just played more than 1000mins to qualify for this list (1080mins to be exact). Say what you want about Clint Hill, but he was certainly consistent across most areas, dominating clearances, aerial duels and tackles. Touches per minute and passing success were relative weak points in his game, but in a struggling QPR team that is probably not much of a surprise – it seems as though when he did touch the ball, he cleared it! UBT, if you are wondering, stands for ‘unsuccessful ball touch’, another area in which Hill does well.

For those of you doubting my system due to its unconventional winner, the rest of the top 10 (or so) may comfort you, particularly as Kompany and Vermaelen are joint second on 42pts. Kompany didn’t clear the ball anywhere near as frequently as Hill, or indeed most other centre backs, but he excelled in every other area.

By including both clearances and headed clearances I have effectively doubled the contribution of 2 fields which are very closely linked and also related to the team in which the defender plays – this is food for thought if I revisit this rating system in the future, considering that both Kompany and Vermaelen were certainly disadvantaged by this.

For a Newcastle fan, it is sad to see Coloccini sitting rock bottom of the list, particularly as his performances won so much praise in leading Newcastle into 5th place last season. Mike Williamson, his partner for much of the season, was only a few places above in 49th position. Also, England’s centre back pairing at Euro 2012, John Terry and Joleon Lescott (14th and 25th respectively), ranked below several other English defenders – including Rio Ferdinand, who was controversially left out of the squad.