Building Better Data Science Tools
(while avoiding a real-world job!)






Tomas Petricek, Alan Turing Institute + fsharpWorks
@tomaspetricek | tomasp.net | fsharpworks.com

Programming languages and tools


Surely, there must be a better way!

Change how we think about the world?

Nice open-source community!


DATA SCIENCE

DATA ACCESS

Understanding the world

F# Data

Access JSON, XML, CSV & more

R Provider

Call thousands of stats packages

DATA ANALYSIS

US government debt

Deedle

Analyze time-series and data frames

XPlot

Wrapper over Plot.ly and Google Charts

DATA JOURNALISM

Welcome to the post-fact world


Technology democratized opinions

Can it also democratize facts?


Data-driven storytelling

Open data-driven storytelling

  • Can the result be reproduced?
    Reinhart–Rogoff, Growth in a Time of Debt

  • Is the visualization misleading?
    One medal per boat or eight medals per boat?

The Gamma

OLYMPICS

Visualizing medalists

(thegamma.net)

Simple data journalism tools


Millions of spreadsheet programmers!

What is the simplest language?

Is programming the new literacy?


OLYMPICS

Choosing disciplines


Code vs user interface


Treat article as a program

Use program analysis research!

How theory helps in practice


Make facts great again!

SUMMARY

Computer science


Can be biology, journalism, mathematics

Enough demand to get a lot of freedom

What is fun and also matters?


Open-source software


No longer 1960s hippie culture

Look for new interesting technologies!

Meet interesting people & get a job too.


Thank you!




Tomas Petricek

tomasp.net | @tomaspetricek | tomas@tomasp.net