What “50 Years of Data Science” leaves out

29th April 2017 in London at CodeNode

There are 20 other SkillsCasts available from Data Science Festival 2017


We’re told data science is the key to unlocking the value in big data, but nobody seems to agree on just what it is. Is it engineering, statistics... or both? David Donoho’s “50 Years of Data Science”, which is itself a survey building on Tukey’s “The Future of Data Analysis”, offers one of the best criticisms of the hype around data science from a statistics perspective. It argues that data science is not new (if it’s anything at all) and calls statistics to action (again) to take back the field with a more practical, modern view of what it means to teach statistics and data science.

Drawing on his blog post, Sean Owen responds with counterpoints from an engineer's perspective, in search of a better understanding of how to teach and practice data science in 2017. You will explore some key points in the history of data science over the past 50 years to build up a more complete view of how data science sprang out of statistics and merged with computer engineering. Finally, you will compare Donoho’s view of what it means to build data science capability with one taken from the experience of organizations doing so in the context of Apache Hadoop, Spark, and other big data tools.



Sean Owen

Sean is Director of Data Science at Cloudera in London. Before Cloudera, he founded Myrrix Ltd (now the Oryx project) to commercialize large-scale real-time recommender systems on Apache Hadoop. He is an Apache Spark committer and co-author of Advanced Analytics with Spark. He was a committer and VP for Apache Mahout, and is co-author of Mahout in Action. Previously, Sean was a senior engineer at Google.