Data Science Studio (DSS) “is a software platform that aggregates all the steps and big data tools necessary to get from raw data to production ready applications.”
In other words, it is an application with a fancy graphical interface that makes it easy to perform analytics and visualizations on data, big and small. Think of the R language, but with a beautiful Web UI. DSS is not free software, but there’s a community edition with slightly limited features that anybody may download. There’s also a 14-day free trial of the enterprise version.
A few days ago I decided to download the community edition and install it on my main computer, which is powered by Fedora 20 KDE. Installation went well, but the official documentation does not state that DSS needs to have Nginx running to function. It took about 10 minutes before I figured that out and installed Nginx from the Fedora repo. Nginx, by the way, is a Web server, just like Apache.
The Web UI of DSS is pretty easy to get used to, so I spent the better part of the afternoon messing with data with help from the official tutorials. Everything was going according to script until I wanted to view a Flow. In DSS-speak, a Flow lets you see the connection between the input and output datasets for a project, like the one shown in this screenshot.
But when I tried to view the Flow for my first project, I got the error shown in this screenshot. The text of the error was: “Cannot run program “dot”: error=2, No such file or directory. HTTP code: 500, type: java.io.IOException”. In other words, DSS tried to launch an executable named dot and the system could not find it.
Nothing pertaining to that error was being written to the Nginx error log, so I couldn’t get any help there. I sent an email to DSS support and got an automated reply that said to expect a response in about 24 hours. That was on Saturday. I’m still waiting for that response. Meanwhile, a little snooping gave me the solution to the problem. I needed to install a package called Graphviz, free graph visualization software released under the Eclipse Public License (EPL). Graphviz provides the dot program that the error message complains about.
So that’s what I did, as root, using this command:
<strong>yum install graphviz</strong>. On Debian/Ubuntu-based distributions, the package is available under the same name, so it can be installed using
<strong>apt-get install graphviz</strong>.
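Before reloading the Flow, it may be worth confirming that the dot executable is actually visible on the PATH. The snippet below is only a minimal sketch of such a check: missing_tools is a hypothetical helper name of my own, not something shipped with DSS, and it assumes that DSS looks up dot through the same PATH as the shell you run it in.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of DSS or Graphviz.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Graphviz provides the "dot" executable named in the error message.
missing=$(missing_tools dot)
if [ -n "$missing" ]; then
    echo "Not found on PATH: $missing (try installing the graphviz package)"
else
    echo "dot found at: $(command -v dot)"
fi
```

If the script reports that dot is missing even after the package is installed, the PATH of the account running DSS is the next thing I would look at.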
So there you have it. A nice and short tutorial on a solution to the “HTTP code: 500, type: java.io.IOException” error on DSS. If you want to take DSS out for a spin, you need to have Nginx and Graphviz installed. If you’re a data scientist looking for a simple application to use, give DSS community edition a try. DSS is able to handle data from several data sources, including files, SQL and NoSQL databases and even Hadoop. Sorry, Windows folks. DSS community edition is only available for Linux. You may download it from http://www.dataiku.com/dss/communityedition/.