Data Science technical proficiencies
Languages: Python, R
Deep Learning: Keras, TensorFlow, gensim
Google stack: BigQuery, Compute Engine, ML Engine, Google NLP API
Visualisation: Tableau, Shiny, Plotly, ggplot, matplotlib/seaborn
Agile Data Science: I use an agile data science workflow (JIRA, Kanban) and advocate reproducible research (GitHub, Markdown, Jupyter notebooks, Shiny).
At TMG, my primary focus has been working on Natural Language Processing (NLP) projects: after all, textual content is the primary product of a newspaper! We have been particularly focussed on building a better classification framework, which suits the breadth of our unique content far better than third-party providers possibly could.
After working with stakeholders across the business to construct a new taxonomy of Telegraph content, I built multi-class deep learning text classifiers using Python, Keras and TensorFlow; these models were then deployed to Google ML Engine, and are currently used in production.
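To give a flavour of the approach, here is a minimal sketch of a multi-class text classifier of the kind described, built with Keras: a small dense network over bag-of-words document vectors with a softmax output over content categories. The layer sizes, vocabulary size, and the number of example classes are all illustrative assumptions, not the production architecture.

```python
# Minimal sketch (NOT the production model): a multi-class text
# classifier in Keras over bag-of-words features.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 1000   # assumed vocabulary size
N_CLASSES = 4       # hypothetical content categories in the taxonomy

def build_classifier():
    model = keras.Sequential([
        keras.Input(shape=(VOCAB_SIZE,)),       # one bag-of-words vector per article
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),                    # regularisation
        layers.Dense(N_CLASSES, activation="softmax"),  # one probability per class
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_classifier()

# Dummy features standing in for vectorised Telegraph articles
X = np.random.rand(8, VOCAB_SIZE)
probs = model.predict(X, verbose=0)  # shape: (8, N_CLASSES), rows sum to 1
```

In practice the model would be trained on labelled articles and exported (e.g. as a SavedModel) for serving from ML Engine; the softmax output makes it straightforward to threshold or rank the predicted categories per article.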
Currently, I am working on recommendation systems for use by our Editorial team, using a number of different frameworks (deep learning, topic modelling, Google NLP). The project is a collaboration between the Editorial, Architecture, Data Science and Engineering teams across The Telegraph, and is scheduled to be up and running during Q1 2018.
Aside from these long-term NLP projects, we also work on short-term ad-hoc projects at TMG. For example, I worked closely with the Data Journalism team on the Telegraph’s coverage of the 2017 UK General Election, building a probabilistic model to predict the result: it proved remarkably accurate, with our per-party seat predictions landing within a few seats of the actual distribution.
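To show the general shape of this kind of forecast (a toy sketch only, not the actual 2017 model), one common probabilistic approach is Monte Carlo simulation: perturb baseline constituency vote shares with polling noise many times, take the winner of each draw, and average seat counts across simulations. The parties, vote shares, and noise level below are entirely made-up illustrative numbers.

```python
# Toy Monte Carlo seat forecast: NOT the actual election model,
# just an illustration of the probabilistic approach.
import numpy as np

rng = np.random.default_rng(2017)
parties = ["Con", "Lab", "LD"]

# Hypothetical baseline vote shares: one row per constituency
baseline = np.array([
    [0.45, 0.40, 0.15],
    [0.35, 0.50, 0.15],
    [0.48, 0.32, 0.20],
    [0.30, 0.55, 0.15],
    [0.42, 0.41, 0.17],
])

def simulate(n_sims=10_000, noise=0.03):
    # Perturb each share with Gaussian polling error; winner takes the seat
    draws = baseline[None, :, :] + rng.normal(0, noise, (n_sims, *baseline.shape))
    winners = draws.argmax(axis=2)          # (n_sims, n_constituencies)
    # Expected seats per party = mean win probability summed over seats
    return {p: float((winners == i).mean(axis=0).sum())
            for i, p in enumerate(parties)}

expected_seats = simulate()
```

The expected seat counts always sum to the number of constituencies, and the spread of seat totals across simulations gives an uncertainty interval for free, which is the real selling point of simulating rather than simply calling each seat for its poll leader.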
Aside from my work at TMG, I have been Lead Data Science Instructor at General Assembly in London since 2016. I have taught a number of their 10-week Introduction To Data Science courses, bringing students from an introduction to Python through to building predictive models and visualisations by the end of the course.
Previously (in academia and elsewhere)…
Before all this, I was an astrophysicist: after getting a Ph.D. from University College Dublin in 2003, I spent a decade doing post-doctoral work in the United States and the United Kingdom. From 2008 until 2014, I worked on the European Space Agency’s Herschel Space Observatory, as part of the team responsible for the SPIRE instrument. On the good days when I actually had time to do research (running a satellite instrument is a very time-consuming pursuit), my main area of investigation was the effects of massive stars in other galaxies on their surrounding environment: hence the (old) name of the blog (plus the domain name). I also taught a number of undergraduate physics and astrophysics courses in Ireland, the US and the UK.
I then left academia to join eFinancialCareers – a financial services job site based in London – as their first data science hire, and helped build their data science team and capabilities up from scratch.
At eFC, our main focus was understanding what our clients were looking for in their next financial services role. To this end, I spent much of my time building NLP models to perform text classification of our user-supplied content. Additionally, we wanted to know what our users were doing on the site on an event-by-event basis: we built up our customer analytics capabilities to do this, and to predict what sort of roles they’d like to see (and apply for).
Still love astronomy (though as an amateur these days), and a bit of a spaceflight geek.
Superb at pub quizzes! Been known to do them competitively.
QPR fan, too, for my sins. I can be found in the Upper Loft at Loftus Road, on occasion.