2nd International Workshop on
BIG Data Software Engineering

Austin, Texas, USA – May 16, 2016

Collocated with ICSE 2016

Big Data is about extracting valuable information from data and using it in intelligent ways, for example to revolutionize decision-making in business, science and society.

Big Data analytics is able to handle data volume (large data sets), velocity (data arriving at high frequency), variety (heterogeneous and unstructured data) and veracity (data uncertainty), the so-called four Vs of Big Data. Research on software analytics and mining software repositories has delivered promising results, mainly focusing on data volume. However, novel opportunities may arise when leveraging the remaining three Vs of Big Data. Examples include using streaming data (velocity), such as monitoring data from services and things, and combining a broad range of heterogeneous data sources (variety) to make decisions about dynamic software adaptation.

BIGDSE’16 aims to explore opportunities that Big Data technology offers to software engineering, both in research and practice (“big data for software engineering”). In addition, BIGDSE’16 will look at the software engineering challenges imposed by building Big Data software systems (“software engineering for big data”).

Call for Papers

BIGDSE’16 seeks contributions of different types, including theoretical foundations, practical techniques, empirical studies, experience, and lessons learned.

Potential and relevant research directions that BIGDSE’16 plans to explore include, but are not limited to:

  • Big Data for run-time monitoring and adaptation of software systems. Big Data taps into the wealth of online data available during the operation of software systems. Monitoring of services, things, cloud infrastructures, users, etc. will deliver an unprecedented range of information, which is available with low latency. Such real-time data offers novel opportunities for real-time planning and decision making and thus supports new directions for software adaptation. As an example, based on changes in user profiles, Big Data techniques may deliver actionable insights on which concrete adaptation actions to perform in response to those changes.
  • Big Data for software quality assurance and diagnosis. Software analytics, i.e., the automated analysis of software artefacts, has been explored for some time. Now, with the significant increase of data volumes as well as analytics capabilities for large volumes of structured and unstructured data, software analytics faces new opportunities in the Big Data area. As an example, monitoring logs of complex systems may easily reach sizes of gigabytes and terabytes in short periods of time. Detecting failure patterns and deviations may thus require Big Data analytics to handle such massive amounts of log data. In addition, deep learning techniques may be applied to perform root cause analysis of software failures.
  • Software architectures and languages for Big Data. NoSQL and MapReduce are predominant when it comes to efficient storage, representation and querying of Big Data. However, apart from large, long-running batch jobs, many Big Data queries involve small, short and increasingly interactive jobs. Supporting such jobs may require new architectures and languages that, for instance, layer classical RDBMS techniques for storage and querying on top of NoSQL and MapReduce paradigms. In addition, as we get more big data stores, we also get more CPUs, so analytics solutions that were computationally impossible 10 years ago are now becoming possible. Ultimately, this may lead to a new generation of software architectures and languages that optimise Big Data querying and retrieval.
  • Quality and cost-benefit of Big Data software. Assuring the quality of Big Data software requires adopting and extending proven quality assurance techniques from software engineering. For example, testing Big Data software may require new ways of generating “test” data that is sufficient and representative. However, due to the size of the data, exhaustive testing may quickly become infeasible, thus requiring (formal) verification techniques to generate assurances for Big Data software. Further, not all data sources may be relevant for a given Big Data analysis task. As these data sources often come with some cost attached (e.g., queries may need to be run across distributed data pools), the cost-benefit of Big Data software should be assessed a priori and not only as an afterthought.
  • Curriculum for Big Data. One emerging area of concern in practice is the lack of skilled Big Data experts who develop, deploy and exploit techniques, processes, tools and methods for developing applications that actually turn Big Data into helpful insights. With a particular focus on Big Data software engineering, BIGDSE’16 invites contributions that provide a critical view on how software engineering curricula may be extended to deliver such experts.

Paper Submission

BIGDSE’16 invites authors to submit either of two kinds of workshop papers:

  • Full papers with 7 pages maximum
  • Position papers with 4 pages maximum

Workshop papers must follow the ICSE 2016 Format and Submission Guidelines.

Paper submission site:

Accepted papers will be published in the ICSE 2016 Workshop Proceedings in the ACM and IEEE Digital Libraries.

The official publication date of the Workshop Proceedings is the date the proceedings are made available in the ACM Digital Library. This date may be up to two weeks prior to the first day of ICSE 2016. The official publication date affects the deadline for any patent filings related to published work.

Important Dates

Paper submission: Jan. 29, 2016 (EXTENDED)
Notification of authors: Feb. 19, 2016
Camera-ready copies: Feb. 29, 2016 (UPDATED)


9:00 – 10:30 Keynote & Paper Session 1 (Real-time Data Analytics)

10:30 – 11:00 Morning break

11:00 – 12:30 Paper Session 2 (Data Management)

12:30 – 14:00 Lunch break

14:00 – 15:30 Panel & Paper Session 3 (Engineering Data-intensive Systems)

15:30 – 16:00 Afternoon break

16:00 – 17:30 Paper Session 4 (Applications and Industry)


Organising Committee

  • Luciano Baresi, Politecnico di Milano, IT
  • Tim Menzies, North Carolina State Univ., US
  • Andreas Metzger, Univ. of Duisburg-Essen, DE
  • Thomas Zimmermann, Microsoft Research, US

Program Committee

  • Kenneth Anderson, U Colorado Boulder, US
  • Titus Barik, North Carolina State U, US
  • Olga Baysal, U Montréal, CA
  • Ayse Basar Bener, Ryerson U, CA
  • Hong-Mei Chen, U Hawaii at Manoa, US
  • Ed Curry, INSIGHT, IE
  • Bojan Cukic, West Virginia U, US
  • Massimiliano Di Penta, U Sannio, IT
  • Jörg Dörr, Fraunhofer IESE, DE
  • Fabiana Fournier, IBM, IL
  • Roger Kilian Kehr, Huawei, DE
  • Michele Lanza, U Lugano, CH
  • Philipp Leitner, U Zurich, CH
  • Grace Lewis, SEI, US
  • Mehdi Mirakhorli, Rochester Institute of Technology, US
  • Nazim Madhavji, U Western Ontario, CA
  • Audris Mockus, U Tennessee, US
  • Meiyappan Nagappan, Queen’s U, CA
  • Tien Nguyen, Iowa State U, US
  • Bernhard Schätz, TU Munich, DE
  • Flavio Villanustre, LexisNexis Risk Solutions, US

Previous Edition

