The R vs. Python language war is dead. This is an observation from Strata + Hadoop World San Jose 2016. There was no discussion among participants about the merits of one over the other, nor was there any content about which is better in any of the sessions that I attended. In a show of acceptance of either language for Data Science, a full-day tutorial was held for each of the two languages.

What has emerged instead is acceptance that Python is the more general-purpose of the two while now also being well suited for Data Science, and that R is the more statistics-specific of the two while also being well suited for Data Science.

What’s also emerged is that the technical challenges underlying integration of these languages into Big Data are essentially the same. A key post by software engineer Wes McKinney discusses the commonality. It’s an important post. Read it here.

The language war is dead. A takeaway is that it’s not one or the other but both. Data Scientists will need to know both. Being more fluent in Python is better. Having enough facility in R to get data into and out of the R ecosystem, being able to use and interpret results from statistical tests, and being able to use the visualization libraries, is probably enough.

Incidentally, search interest in R has been stable for the last three years:

The pandas-datareader package is not included in the Anaconda distribution (at least not as of the most current distribution on the date of this post). As a result, an error occurs in line 6 of the strata_pandas.ipynb notebook used in the PyData tutorial. If you want to use the features in this module in a script or Jupyter notebook, you first need to install the package. Here’s the conda command to download and install it:

conda install -c https://conda.anaconda.org/anaconda pandas-datareader

If you execute the command conda list at the Bash prompt before and after the install, you’ll see that the package is not there at first, then is present after the install.
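
The same before-and-after check can be made from inside Python. This is only a minimal sketch: the helper name is_installed is hypothetical (it is not part of pandas or conda), and it assumes the package's import name is pandas_datareader. It looks the module up without importing it, so it is safe to run before the install.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for illustration; pass top-level names only.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: "pandas_datareader" is the import name of the
    # pandas-datareader package installed by the conda command above.
    name = "pandas_datareader"
    if is_installed(name):
        print(f"{name} is available")
    else:
        print(f"{name} is missing; run the conda install command first")
```

Running this in the notebook before the failing cell tells you whether the error comes from the missing package or from something else.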

For more info including changes: https://anaconda.org/anaconda/pandas-datareader

I created a Note to self category after seeing this one in person and realizing how big a perception or brand-image problem it could create. If you are setting up to make a pitch, get your video hooked up and make sure it’s displaying through the projector…then turn down the brightness until you’re up.

Here’s an example of what I mean. The speaker who is standing is handling Q&A for his startup’s pitch. Behind him on the screen is the logo and opening slide for the startup that the next speaker will be pitching. Why is this a problem? Well, it can vary. In this case, the speaker wrapping up Q&A is standing next to the logo not for his own startup, but for a startup whose mission is to be the online marketplace for the wholesale distribution of regulated cannabis!

Set up display, then turn brightness down until it's your turn to pitch.

Set up your display, then turn brightness down until it’s your turn to pitch. I caught this at the March 21, 2016, PyCon Startup Row Pitch Night in Seattle. I’d never noticed it before. It occurred at each of the transitions between pitch talks that evening.

Final note: ReUP was one of two startups selected that evening to represent Seattle at the upcoming PyCon event in Portland. Congrats!