
Data: How will England perform in the World Cup this summer?

Is there a correlation between clubs’ Champions League performances and their nations’ FIFA rankings?

Long has the debate raged about whether the strength of England’s domestic league harms the national team’s performance or, conversely, whether German clubs’ recent renaissance has been caused in part by the resurgence of the German national team.

Yet few attempts have been made to examine this link statistically. In this post, I will look into whether there is a correlation between the performances of a nation’s clubs in international competition and the performance of the nation itself, as measured by its FIFA ranking, to see whether this season’s Champions League can give us a clue to the outcome of the summer’s World Cup.

To read more of this story, visit this post on my Tumblr account (used because Tumblr can host interactive graphics, whereas my version of WordPress cannot). 

Work experience in Trinity Mirror’s data unit

Earlier this month, I spent a week working under David Ottewell in Trinity Mirror’s data unit.

The data unit, formed only last year, comes up with various data-based article ideas for the different Trinity Mirror regional titles, which it presents in an internal bulletin. The regional titles then choose to use the findings as and when they wish.

I was given several interesting projects throughout the week, the most important of which was my research into MPs’ expenses. (The data on expenses can be found here.)

This led to a story on Wales Online:

I also did some research into how consistent each Premier League manager’s team selections have been this year. (The idea for the article came from someone at Wales Online, who had an inkling that Ole Gunnar Solskjær’s lack of success so far as Cardiff City manager might have been caused by his frequent tinkering with team selections.)

There was significant interest in this research from different Trinity Mirror titles. It appeared first in the Manchester Evening News, then on Wales Online, and it even featured on the Mirror’s national site.

Some of the work performed by the data unit is long-term preparation: for instance, I helped to compile a spreadsheet of national teams’ World Cup records, ahead of the FIFA World Cup in June.

Much of the work, however, came simply from the day-to-day inspiration of the team members. For instance, David Dubas-Fisher had the idea of comparing the levels of support of football teams in the Championship this season; David Ottewell then suggested that we expand this research to a comparison of away support and home support (as teams’ home support can be influenced by various factors like ticket prices and the clubs’ catchment areas).

The result was that Burnley came out on top, to the delight of celebrity fan Alastair Campbell, whose retweet of the research led to much debate about our findings:

My final piece of work during this enjoyable week was to produce interactive maps of dentists listed on the NHS website as currently accepting new adult patients. I created these for different Trinity Mirror regions, as we thought they would be of use to readers. One of the maps, of Liverpool’s available dentists, is below*.

Liverpool NHS dentists

Note: click on the map to be taken to an interactive version.

*If you wish to see the other maps – for Birmingham, Coventry, Newcastle and Middlesbrough – let me know by commenting at the bottom of this article.

All in all, it was an excellent week, and one in which I learnt a lot about data journalism. I am grateful to David for giving me the opportunity.

For advice on getting work experience in journalism, have a look at the section on Wannabe Hacks.

Jess Denham, one of last year’s Interhacktives, interviewed David Ottewell last year. Read the interview here.

Premier League spreads out: dominance of North West and London has lessened

Norwich’s decision to reimburse fans who travelled the long road to Swansea a fortnight ago, only to see their team limp to a 3-0 defeat, got me thinking about the locations of clubs in the Premier League.

I remembered that, only a few years ago, hardly any clubs based outside the football heartlands of the North West and London were in the Premier League. As you can see from the map below, the concentration in these two corners of the country in the 2010-11 season was quite startling.

Premier League 2010-11 clubs' location

Note: click on the map to be taken to an interactive version.

Yet this seems to have changed of late, providing more diversity for Premier League away fans. The map below shows the locations of clubs in the 2013-14 season.

Premier League 2013-14 clubs' location

Note: click on the map to be taken to an interactive version.

The promotions of Southampton, Norwich and Hull, along with those of the two Welsh clubs, have contributed to a more diverse Premier League.

Of course, the fact that these teams have been only recently promoted means it should be no surprise that the country’s best-performing clubs remain in the North West and London. For the sake of more interesting away days, however, it is preferable to have a more even spread of clubs around the country – and, with Leicester City (of the East Midlands) already set to join the league next season, it looks like the concentration of a few years ago has gone for good.

Data map: London has England and Wales’ top five multicultural areas

With much hyperbole surrounding the national debate about immigration and Britain’s increasingly multicultural society, it is useful to inject some facts into the discussion.

And, as you might expect, London dominates when it comes to England and Wales’ most multicultural areas.

Location | White residents
Newham | 29%
Brent | 36%
Harrow | 42%
Redbridge | 43%
Tower Hamlets | 45%
Slough | 46%
Ealing | 49%
Leicester | 51%
Hounslow | 51%
Waltham Forest | 52%

Top of the table is Newham, only 29% of whose residents are white. This compares with the Isles of Scilly at the other end, where a whopping 99% of residents are white.

As the map below shows, the areas where the proportion of white residents is at its lowest are, without exception, urban areas.

White residents map

Click the map to be taken to an interactive version.

Disclaimer: unfortunately, the geocodes for St. Albans were unavailable in the dataset I downloaded from the Office for National Statistics. I assure you, however, that its figure was unexceptional: 88% of its residents are white.

Are you surprised by this result? Is there any way I could have improved this story? Please let me know in the comments section below.

Data analysis: The relationship between English language proficiency and economic performance

Looking on the ONS website today, I found a plethora of data – drawn from the 2011 Census – on the relationship between proficiency with the English language and various aspects of economic performance.

I chose to focus on the data on the relationship between linguistic ability and economic activity, using Tableizer to present it. The headline figure is unsurprising: those with a poor level of English are, in general, less likely to be economically active (only 47% of those in this category were economically active, compared with the overall figure of 63%*).

Linguistic ability | Total | Economically active | %
All | 45,496,780 | 28,818,355 | 63%
Native | 41,820,374 | 26,455,028 | 63%
Good English | 2,891,769 | 1,996,782 | 69%
Poor English | 784,637 | 366,545 | 47%
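As a quick check, the percentage column can be recomputed from the two columns of counts. This is only a minimal sketch: the figures are copied from the table above, while the names ECONOMIC_ACTIVITY and activity_rate are made up for illustration and are not ONS terms.

```python
# Counts copied from the table above (2011 Census, England and Wales).
# Each entry is (total residents, economically active residents).
ECONOMIC_ACTIVITY = {
    "All": (45_496_780, 28_818_355),
    "Native": (41_820_374, 26_455_028),
    "Good English": (2_891_769, 1_996_782),
    "Poor English": (784_637, 366_545),
}


def activity_rate(total, active):
    """Return the economically active share as a whole-number percentage."""
    return round(100 * active / total)


# Hypothetical helper structure: one rounded percentage per group.
rates = {
    group: activity_rate(total, active)
    for group, (total, active) in ECONOMIC_ACTIVITY.items()
}

for group, rate in rates.items():
    print(f"{group}: {rate}%")
```

Rounding to whole numbers mirrors the table; working with the unrounded figures would be safer for any further calculation.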

Jargon buster: It is important to point out that ‘economically active’ does not mean simply ‘in employment’; it means either ‘in work’ or ‘looking for work’ – so it excludes, for example, stay-at-home parents and pensioners.

So, why could this be? (It couldn’t be because most poor English speakers are children, because this data set is for residents of England and Wales over the age of 16.)

Thankfully, the ONS provides a breakdown of the data, so we can look at what is making those with poor English ability economically inactive.

Ability | Total | Retired | Student | Looking after family | Sick / disabled
All | 45,496,780 | 9,713,808 | 2,397,348 | 1,796,520 | 1,783,292
Native | 41,820,374 | 9,422,213 | 2,021,824 | 1,462,558 | 1,658,503
Good English | 2,891,769 | 165,346 | 342,153 | 210,531 | 63,323
Poor English | 784,637 | 126,249 | 33,371 | 123,431 | 61,466

Translated into percentages:

Ability | Total | Retired | Student | Looking after family | Sick / disabled
All | 45,496,780 | 21% | 5% | 4% | 4%
Native | 41,820,374 | 23% | 5% | 3% | 4%
Good English | 2,891,769 | 6% | 12% | 7% | 2%
Poor English | 784,637 | 16% | 4% | 16% | 8%

(This isn’t quite the full total of economically inactive people – there was also a column for ‘Other’, but it’s quite difficult to analyse such a vague grouping.)
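For anyone who wants to reproduce the percentages, here is a rough sketch of the conversion: each count is divided by that group's total and rounded to the nearest whole number. The counts are copied from the first breakdown table; the names INACTIVE and shares are my own illustrative choices, and the 'Other' column is left out, as it is in the tables.

```python
# Counts copied from the breakdown table above (2011 Census, England and Wales).
INACTIVE = {
    "All": {
        "total": 45_496_780, "Retired": 9_713_808, "Student": 2_397_348,
        "Looking after family": 1_796_520, "Sick / disabled": 1_783_292,
    },
    "Native": {
        "total": 41_820_374, "Retired": 9_422_213, "Student": 2_021_824,
        "Looking after family": 1_462_558, "Sick / disabled": 1_658_503,
    },
    "Good English": {
        "total": 2_891_769, "Retired": 165_346, "Student": 342_153,
        "Looking after family": 210_531, "Sick / disabled": 63_323,
    },
    "Poor English": {
        "total": 784_637, "Retired": 126_249, "Student": 33_371,
        "Looking after family": 123_431, "Sick / disabled": 61_466,
    },
}


def shares(row):
    """Convert one row of counts into whole-number percentages of its total."""
    total = row["total"]
    return {
        reason: round(100 * count / total)
        for reason, count in row.items()
        if reason != "total"
    }


# One dictionary of rounded percentages per group.
percentages = {group: shares(row) for group, row in INACTIVE.items()}

for group, row in percentages.items():
    print(group, row)
```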

The table shows that the main differences between those with poor English ability and the average are that, proportionally, many more of them – four times the average (16% versus 4%) – are engaged in looking after family, and twice the average (8% versus 4%) are sick / disabled.

There are many potential reasons for this, such as that those with poor English ability might come from more traditional or religious backgrounds, in which it is rare for both parents to work. So it is instructive to note that, although far fewer people with poor English ability are economically active than the average, there might be good reasons for this – and we ought not to be distracted by the headline figure.

There are other points of interest in this data set – such as why those with good English ability are more economically active than native speakers (69% versus 63%) – but I’ll save those for another time.

*Note: The terms in the tables are not official ONS terms; I translated them to cut through the jargon.

The data used here refer to residents of England and Wales over the age of 16, and can be found here.

UK unemployment map

In our data journalism class with John Burn-Murdoch today, we were taught how to use cartodb.com – a useful online tool for creating interactive maps out of spreadsheet data.

We downloaded data on unemployment in each local authority district in the UK from Nomis Web, and then downloaded mapping data from the Office for National Statistics’ geoportal.

We then merged the data, uploaded the result to cartodb.com – and hey presto:

Unemployment in the UK by local authority district


If you click the map above, you’ll be taken through to the interactive version, which allows you to zoom in and see the specific numbers.

Bear in mind, when looking at the map, that the percentages refer only to people on the claimant count – and so might be lower than you would’ve expected.
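For the curious, the merge step can be sketched in a few lines of Python. This is not how we did it in class, just an illustration of the idea: match each row of unemployment figures to the boundary row that shares its district code. The function name merge_by_code, the field names and the placeholder rows are all hypothetical; the real Nomis and ONS files use their own column names.

```python
def merge_by_code(unemployment_rows, boundary_rows, key="district_code"):
    """Join two lists of dicts on a shared district code.

    Returns the merged rows plus the codes that had no boundary match,
    so that missing districts can be spotted before mapping.
    """
    boundaries = {row[key]: row for row in boundary_rows}
    merged, unmatched = [], []
    for row in unemployment_rows:
        match = boundaries.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            # Combine boundary fields and unemployment fields in one row.
            merged.append({**match, **row})
    return merged, unmatched


# Placeholder rows only: these codes and values are invented for illustration.
unemployment = [
    {"district_code": "X0001", "claimant_rate": 1.0},
    {"district_code": "X0002", "claimant_rate": 2.0},
]
boundaries = [
    {"district_code": "X0001", "geometry": "placeholder shape"},
]

merged, unmatched = merge_by_code(unemployment, boundaries)
print(merged)
print(unmatched)
```

Keeping a list of unmatched codes is a deliberate choice: a district that drops out of the merge silently would simply vanish from the map.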