1st International Workshop on
BIG Data Software Engineering

Firenze, Italy – May 23, 2015

Collocated with ICSE 2015

Theme and Goals

Big Data is about extracting valuable information from data and using it intelligently, for instance to revolutionize decision-making in business, science and society.

Big Data analytics is able to handle data volume (large data sets), velocity (data arriving at high frequency), variety (heterogeneous and unstructured data) and veracity (data uncertainty), the so-called four Vs of Big Data. Research on software analytics and mining software repositories has delivered promising results, mainly focusing on data volume. However, novel opportunities may arise when leveraging the remaining three Vs of Big Data. Examples include using streaming data (velocity), such as monitoring data from services and things, and combining a broad range of heterogeneous data sources (variety) to make decisions about dynamic software adaptation.

BIGDSE’15 aims to explore the opportunities that Big Data technology offers to software engineering, in both practice and research. In addition, BIGDSE’15 will look at the challenges imposed by building Big Data software systems.

Overall, BIGDSE’15 will feature contributions and discussions that strengthen the link between Big Data and software engineering, and that critically examine issues such as the cost-benefit of Big Data.

Call for Papers

Since its inception, software engineering as a research discipline has been challenged with collecting and analyzing empirical evidence (be it about people, processes or artifacts) to develop principles, models and theories. In recent years we have seen strong interest in, and efforts devoted to, evidence-based approaches to theory building. We are now at a crossroads where an unprecedented amount of data is available in real time and from a multitude of sources. Complementing this trend in data availability is the emergence of novel and improved analytics algorithms and tools (such as deep learning) that allow us to distil actionable insights for software adaptation, evolution and quality. For software engineering, as for other disciplines in science and economics, these developments may lead to radically new ways of, and unprecedented opportunities for, attacking problems.

Big Data software systems (a.k.a. data-intensive software systems) represent an emerging class of software systems that challenges existing software engineering principles, methods and tools due to the sheer size and real-time processing of data.

The impact that these opportunities and challenges will have on software engineering is of relevance to BIGDSE’15.

Topics (non-exclusive)

BIGDSE’15 seeks contributions of different types, including theoretical foundations, practical techniques, empirical studies, experience, and lessons learned.

Potential and relevant research directions that BIGDSE’15 plans to explore include, but are not limited to:

  • Big Data for run-time monitoring and adaptation of software systems. Big Data taps into the wealth of online data available during the operation of software systems. Monitoring of services, things, cloud infrastructures, users, etc. will deliver an unprecedented range of information, which is available with low latency. Such real-time data offers novel opportunities for real-time planning and decision making and thus supports new directions for software adaptation. As an example, based on changes in user profiles, Big Data techniques may deliver actionable insights on which concrete adaptation actions to perform in response to those changes.
  • Big Data for software quality assurance and diagnosis. Software analytics, i.e., the use of automated analysis of software artefacts, has been explored for some time. Now, with the significant increase of data volumes as well as analytics capabilities for large volumes of structured and unstructured data, software analytics faces new opportunities in the Big Data area. As an example, monitoring logs of complex systems may easily reach sizes of gigabytes and terabytes within short periods of time. Detecting failure patterns and deviations may thus require Big Data analytics to handle such massive amounts of log data. As a further example, deep learning techniques may be applied for performing root cause analysis of software failures.
  • Software architectures and languages for Big Data. NoSQL and MapReduce are predominant when it comes to efficient storage, representation and querying of Big Data. However, apart from large, long-running batch jobs, many Big Data queries involve small, short and increasingly interactive jobs. Supporting such jobs may require new architectures and languages that, for instance, apply classical RDBMS techniques for storage and querying on top of NoSQL and MapReduce paradigms. In addition, as we get more Big Data stores, we also get more CPUs, so analytics solutions that were computationally impossible 10 years ago are now becoming possible. Ultimately, this may lead to a new generation of software architectures and languages that optimise Big Data querying and retrieval.
  • Quality and cost-benefit of Big Data software. Assuring the quality of Big Data software requires adopting and extending proven quality assurance techniques from software engineering. For example, testing Big Data software may require new ways of generating “test” data that is sufficient and representative. However, due to the size of data, exhaustive testing may quickly become infeasible, thus requiring (formal) verification techniques to generate assurances for Big Data software. Further, not all data sources may be relevant for a Big Data analysis task. As these data sources often come with a cost (e.g., queries may need to be run across distributed data pools), the cost-benefit of Big Data software should be assessed a priori and not only as an afterthought.

Paper Submission

BIGDSE’15 invites authors to submit either of two kinds of workshop papers:

  • Full papers with 7 pages maximum
  • Position papers with 4 pages maximum

Workshop papers must follow the ICSE 2015 Format and Submission Guidelines.

Paper submission will be handled through EasyChair.


Accepted papers will be published in the ICSE 2015 electronic conference proceedings and in the digital libraries of ACM and the IEEE Computer Society.


Workshop Program

9:00 – 9:15 Welcome

9:15 – 10:15 Industrial Keynote

  • Flavio Villanustre (Leader of HPCC Systems and VP Technology for LexisNexis Risk Solutions): Industrial Big Data Analytics: Lessons from the Trenches (cancelled due to delayed flight)

10:15 – 10:30 Introduction Round

10:30 – 11:00 Morning break

11:00 – 12:30 Paper Session 1: “Processes and Methods” (session chair: Olga Baysal)

12:30 – 14:00 Lunch break

14:00 – 15:30 Paper Session 2: “Opportunities and Challenges” (session chair: Rick Kazman)

15:30 – 16:30 Afternoon break

16:30 – 17:30 Paper Session 3: “Industry and Practice” (session chair: Mehdi Mirakhorli)

17:30 – 18:00 Wrap-up and BIGDSE 2016


Important Dates

Paper submission (extended): Jan. 30, 2015
Notification of acceptance: Feb. 18, 2015
Camera-ready copies: Feb. 27, 2015
Workshop: May 23, 2015

Organising Committee

  • Luciano Baresi, Politecnico di Milano, IT
  • Tim Menzies, North Carolina State Univ., US
  • Andreas Metzger, Univ. of Duisburg-Essen, DE
  • Thomas Zimmermann, Microsoft Research, US

Program Committee

  • Olga Baysal, U Montréal, CA
  • Edward Curry, DERI, IE
  • Massimiliano Di Penta, U Sannio, IT
  • Jörg Dörr, Fraunhofer IESE, DE
  • Fabiana Fournier, IBM, IL
  • Daniela Grigori, U Paris Dauphine, FR
  • Roger Kilian Kehr, Huawei, DE
  • Michele Lanza, U Lugano, CH
  • Philipp Leitner, U Zurich, CH
  • Grace Lewis, SEI, US
  • Jordi Marco, UP Catalunya, ES
  • Mehdi Mirakhorli, Rochester Institute of Technology, US
  • Audris Mockus, U Tennessee, US
  • Emerson Murphy-Hill, North Carolina State U, US
  • Meiyappan Nagappan, Queen's U, CA
  • Tien Nguyen, Iowa State U, US
  • Bernhard Schätz, TU Munich, DE

The photos used on this website are licensed under the "Creative Commons Attribution-Share Alike 3.0 Unported" and "Creative Commons Attribution 2.0 Generic" licenses ("Overview of Florence from Campanile di Giotto", "Florence at night seen from the Piazzale Michelangelo").