Ask DataTau: How do you present your data science findings at work?
17 points by winterflower 2380 days ago | 11 comments
For those of us working on data science problems in industry, how do you present your findings to your supervisor/employer/boss? I did some experiments using IPython/Jupyter notebooks, but I have now reverted to plain old Word docs and PowerPoint slides. Any tips/suggestions/own experiences welcome.

5 points by patientfrog 2380 days ago | link

Good question. For me it definitely depends on the audience.

IPython notebooks are great for technical audiences who might want to see some code, but for management I typically extract plots from the notebooks and put them into slides.

I always feel it is easier to get my message across to engineers/data scientists than to management, though. That is, I feel like "plots in slides" are not always powerful enough to tell my story. It's not that management doesn't know enough stats or whatever to understand what is happening. It's more that if I show something like a correlation between x and y with z confidence, they might come back with questions like, "...interesting, but what does that mean?" So I have to be a little more creative with my messaging.


2 points by almeria 2380 days ago | link


In response to '...but what does that mean?', I try to anticipate it and deploy the magic phrase, "...and here's what that means".

When managers review work, there's the 'If you give a mouse a cookie...' problem. In that stage, quick and dirty responses work well to prevent over-producing on idle curiosities. Rough syntax and output speed up the review cycle and help minimize throwaway work.


3 points by alexaindredumas 2379 days ago | link

I mostly use RMarkdown, compiled to a PDF or HTML, sometimes slides. All with the aid of RStudio. As already mentioned, knowing the audience is essential. More technical stuff goes in the appendix, since it is sometimes too much work to write one report per audience. In case you really want to look at the code, I store the R project on GitHub for all the technical details.


3 points by almeria 2380 days ago | link

Many ways, depending on intent and audience.

Primarily, a standard departmental template to show conclusions on the problem studied. This is the medium also used to share analysis findings and discussion outside of the group.

When the results will be requested again in the future with updated source data: R Markdown documents.

When many other people have to do something with the information, or multiple dimensions/interactivity is key: Tableau.

Informally and to support internal function: regular meetings/updates including printed graphs, projector screens & pasted output annotated for discussion.


2 points by TheCartographer 2378 days ago | link

I am fortunate in that I work with a bunch of old-school scientists who generally favor pen-and-paper approaches over anything digital and don't mind the cost of office supplies.

So my usual answer to this question (for basic exploratory analyses with project teams) is to use R to make extremely high resolution 36x42in (or whatever) plots of the data and then print them using our map plotter and hang them on the wall. This allows us to quickly draw and annotate on them when we are working with the data. It also helps overcome PowerPoint's little resolution problem (and more generally, the lack of resolution inherent to most business-class projectors).
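
To put rough numbers on that resolution gap, here is a back-of-the-envelope sketch. The 300 DPI print setting and the 1024x768 projector are illustrative assumptions, not the actual equipment described above.

```python
def pixels(width_in, height_in, dpi):
    """Pixel dimensions of a print of the given size at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


PRINT_DPI = 300          # assumed plotter resolution
PROJECTOR = (1024, 768)  # assumed projector resolution in pixels

plot_px = pixels(36, 42, PRINT_DPI)
ratio = (plot_px[0] * plot_px[1]) / (PROJECTOR[0] * PROJECTOR[1])

print(f"36x42in at {PRINT_DPI} DPI: {plot_px[0]} x {plot_px[1]} px")
print(f"about {ratio:.0f}x the pixels of a {PROJECTOR[0]}x{PROJECTOR[1]} projector")
# prints:
# 36x42in at 300 DPI: 10800 x 12600 px
# about 173x the pixels of a 1024x768 projector
```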

If I'm showing lots of high resolution plots, I generally favor using something like Windows picture viewer over PowerPoint, because I can use one high resolution graph and zoom in on the relevant sections as I discuss them.

I've been playing around with things like the threejs plugin for QGIS and the dygraphs package for R to generate self-contained, dynamic, and/or 3D visualizations for management. This gives a "wow" factor and keeps management from falling asleep during a data heavy presentation - and the dygraphs or threejs maps can be recycled to my project teams for their casual use.

The resolution problem is one I struggle with frequently. I want to show 10 years of time series data at 12 locations at once and highlight the relationships between time series. Doing that with enough detail and clarity to be useful is a big issue for me. The usual answer is either a really big high-res plot (maintains "connections" in the data but can be difficult to interpret and use during a presentation) or lots of lower resolution plots that window in on certain interesting phenomena (easier to interpret but chops up the data and implies separate, rather than interconnected, phenomena).

I wish I was more versed in creating dynamic visualizations in dygraphs and the like, but becoming so would require me to drop everything and learn JavaScript, HTML and CSS - something I have been refusing to do on the grounds of preserving what little sanity I have left....


2 points by searine 2380 days ago | link

R / Excel > Adobe Illustrator > PDF/Slides


2 points by roycoding 2380 days ago | link

I have been playing with exporting IPython notebooks to create reveal.js slideshows.
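
For anyone wanting to try this, nbconvert includes a `slides` exporter that produces a reveal.js deck. A minimal sketch, assuming Jupyter and nbconvert are installed and on the PATH; the notebook name `analysis.ipynb` and both helper functions are hypothetical.

```python
import subprocess
from pathlib import Path


def slides_command(notebook):
    """Build the nbconvert command that exports a notebook as reveal.js slides."""
    return ["jupyter", "nbconvert", "--to", "slides", str(notebook)]


def export_slides(notebook):
    """Run the export and return the expected path of the generated deck."""
    subprocess.run(slides_command(notebook), check=True)
    # By default nbconvert writes <name>.slides.html next to the notebook.
    return Path(notebook).with_suffix(".slides.html")


# Usage (requires Jupyter to be installed):
# deck = export_slides("analysis.ipynb")
```

Each cell's role (slide, sub-slide, fragment, skip) comes from its slideshow metadata, which can be set in the notebook's Slideshow cell toolbar before exporting.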


1 point by Cartin1234 2379 days ago | link

Wow... anyone using Python here? All answers are with R.


1 point by Roon 2379 days ago | link

Right now, my data science duties are incidental to my primary role of doing application support for my company's Spotfire user community, so almost anything I do which other people are going to see is done in Spotfire. If it's more complex code, I might do a proof of concept in RStudio first.


1 point by LittlePeter 2379 days ago | link

R Shiny


1 point by ubercode5 2379 days ago | link

Today we use OBIEE for our visualization layer. (I know it's ugly, but that's a political battle above my pay grade.) The users of these dashboards are primarily non-technical or management.

