Minutes Big Software on the Run, December 04, 2015

Annibale Panichella, Arie van Deursen, Arnd Hartmanns, Bart Postma, Boudewijn van Dongen, Cong Liu, Gamze Tillem, Huub van de Wetering, Inald Lagendijk, Jack van Wijk, Jaco van der Pol, Nour Assy, Sicco Verwer, Vincent Bloemen, Wil van der Aalst, Zekeriya Erkin, Moritz Beller
Mariëlle Stoelinga, Maikel Leemans, Marieke Huisman

1) Welcome, brief discussion of the action points from the September 15 meeting (Wil)

See slides.

Action points pending from last meeting:

[Action point (Jaco): Generate (synthetic) datasets for concurrent software settings and distribute them.]

[Action point (Huub): Get in touch with NetBeans to obtain data from error-reporting or usage.] Contacted but no response yet.

[Action point (Arie): Get in touch with Logstash / Log4j / Holistic search / Kibana.]

[Action point (Sicco): Setting up a software-mining competition.]

[Action point (Sicco): Data generated by app-fuzzing.]

[Action point (Arie): Provide pointers to interesting parties working on continuous deployment / DevOps.]

[Action point (Arie): Discuss with people working on GHTorrent.] In progress

[Action point (Mariëlle): Approach magazines like AG and Bits and Chips and try to get BSR articles in.]

[Action point (Jaco): Fill-in missing PhD position.]

Additional remark: The postdocs act as the contact nodes for their universities (Nour Assy from TU/e, Annibale Panichella from TUD and Arnd Hartmanns from UT). They should collaborate and communicate more closely to organize and drive the joint work.

[Action point (Nour): continuously update the BSR website with new content and the latest news.]

[Action point (All): send your latest news related to the BSR project (e.g. accepted articles, presentations at workshops/conferences, prototypes, etc.) to Nour so that it can be added to the website.]

2) 20-minute presentation per university

TU/e (by Nour and Bart) - Mining and visualizing behavioral software processes (see slides): The process mining group has started a case study with ProM. They have defined the required data type and the representation of the software process model with respect to the software engineering domain, and they are currently working on a discovery algorithm to mine activity diagrams. The visualization group has started with a top-down approach: they are defining a taxonomy of visualization techniques for event data and exploring how software event data fits into this taxonomy.

TUD (by Arie, Annibale and Gamze): The security group is working on designing an infrastructure and privacy-preserving protocols for data processing. The main challenges are (i) what to protect and how, (ii) whether conformance checking can be done without revealing all details and (iii) what type of conformance checking can be done on protected data.

UT (by Jaco) – Parallel checking and prediction: The formal methods and tools group is developing parallel algorithms for analyzing software process models. For prediction purposes, they use quantitative models that capture nondeterminism, probability and timing information. The current challenge is to settle on a standard for representing quantitative models (analogous to the XES standard for traces).

It is important to have a shared dataset on which all groups can work. The dataset generated by the process mining group and shared on GitHub could be used for a first case study.

3) Decide on winter school (Wil)

We aim to hold a one-week winter school in October or November 2016. We proposed a four-week window between 9 October and 6 November, within which we will check for overlapping conferences and events.

[Action point (Nour): create a Doodle poll to schedule the school week.]

[Action point (All): check for overlapping conferences or events in this period and select the available week in the Doodle poll before December 20th.]

[Action point (All): send your suggestions for speakers before December 20th.]

[Action point (All): propose ideas and send them to Wil before December 20th.]

We intend to embed the school in the programs of Dutch research schools (e.g. ASCI, SIKS, IPA) and to make it a biennial event.

[Action point (Inald): look into ASCI integration before December 20th.]

[Action point (Wil): look into SIKS integration before December 20th.]

[Action point (Arie): look into IPA integration before December 20th.]

We should also think about the educational learning objectives. We should explain these objectives to the speakers and define the tracks in relation to their expertise. There should be ample time in the program for participants to work on concrete cases.

[Action point (Arie): define the tracks in relation to speakers’ interests before December 20th.]

[Action point (Arie and Wil): rank speakers and send invitations on January 15th.]

[Action point (All): send the first call for participation, without speakers’ names, on January 15th.]

[Action point (Wil): fix the program before March 1st.]

[Action point (Annibale, Arnd and Nour): prepare posters and flyers before March 1st.]

[Action point (All): send the second call for participation on March 1st.]

[Action point (All): each partner proposes an outline for their hands-on sessions before April 1st.]

4) Decide on infrastructure (Boudewijn)

There is a budget of 100K for infrastructure. We have an offer from Dell for 70K with a 3-year warranty. Hosting is free in 2016 and will be at the High Tech Campus Eindhoven. Jaco suggests going for an interconnect of at least 40 Gbit rather than 10 Gbit.

We discussed the possibility of connecting our infrastructure to an existing one (DAS5). The decision on infrastructure is therefore postponed until after the discussion with the DAS5 people.

[Action point (Inald, Boudewijn): meet with the DAS people and obtain a recommendation about hardware.]

[Action point (Boudewijn): send a proposal before January 15th, based on the conversation with the DAS people, and decide who will manage the servers.]

5) Data generation and distribution

Data from the Eclipse Foundation (Moritz): There is an opportunity to obtain a dataset from the Eclipse Foundation. The data consists of the Java stack traces sent to Eclipse. However, sharing this dataset may be subject to restrictions, and it is not clear how long obtaining it may take.

Dataset from ProM using RapidMiner (Cong): We have event logs generated from running ProM, available in two formats: CSV and XES. They are shared on GitHub together with explanatory slides, so all groups can start working on them. If you need further explanation, additional running cases, more information about the data, etc., please contact Cong Liu.

[Action point (Nour): create a folder for commonly shared datasets on GitHub.]

[Action point (Cong): Share ProM dataset on GitHub.]

[Action point (All): Share your dataset on GitHub.]

[Action point (Ine, Nour): create a Dropbox folder to share slides, minutes, documents, etc.]

[Action point (All): share the meetings’ minutes and slides in Dropbox.]

6) Planning

The next meeting of the whole consortium will be on 23-24 March in Delft or Amersfoort. The PhD course day starts on 23 March at noon and continues on the morning of 24 March. The consortium meeting starts on the afternoon of 24 March.

We agreed that PhD students and postdocs attend the PhD program and that the host institute decides on the topics covered. The goal is (i) to let participants understand the basics needed for this project and (ii) to let them understand what the PhD students are working on.

[Action point (Arie): decide on the location (Delft or Amersfoort) before January 15th.]

[Action point (Delft): provide the program before February 1st.]