Over the past decades, machine learning research has achieved great success in developing advanced and scalable data analytic algorithms. However, many machine learning practitioners face significant challenges in fully utilizing those research outcomes. The plenary talk at the International Conference on Machine Learning (ICML 2012) discussed "Machine Learning that Matters" and raised the issue that there is a big gap between current machine learning research and the actual needs and use in practical contexts. The reasons could be manifold, including a lack of sufficient training in the basic concepts of machine learning, a steep learning curve for existing tools, or the complex nature of practical machine learning, which involves parameter setting, cross-validation, and combinations of algorithms such as ensemble methods. In addition, we in the machine learning community are ourselves confronted with significant challenges, such as the reproducibility of experimental results and the robustness of complex algorithms.
All of these challenges are exacerbated in big data applications: handling complex analytic processes manually becomes impractical as the size of the data and the complexity and dynamic nature of the problem increase. Computational tools such as workflows and probabilistic programming provide an elegant solution for analyzing and solving increasingly data-intensive, complex machine learning problems. In particular, scientific workflows allow scientists to represent complex analyses in a high-level declarative manner, manage large-scale computation, and capture data provenance. In fact, workflow management systems like myGrid/Taverna, Wings/Pegasus, Kepler, and Chimera are becoming increasingly popular for both specifying and executing such data-intensive analyses. Repositories like myExperiment and CrowdLabs are emerging as a way for domain experts to share their full end-to-end workflows, as well as workflow fragments, with other users.
This workshop will provide an opportunity for researchers and practitioners to share their experiences and cutting-edge research in using scientific workflows for machine learning applications.
Details of the Call For Papers are under the Call For Papers link above. For more information, please use the other links in the menu at the top. For any questions, please contact Ricky Sethi at rickys@sethi.org.