All times are in Eastern Time (US & Canada).
June 09, 2023
| 8:30 – 9:00 | Breakfast |
| 9:00 – 9:15 | SEMLA 2023 Opening Session. Opening by Prof. Foutse Khomh; welcome message by Annie Ross (Deputy Vice-President, Research, Polytechnique Montréal) |
| 9:15-10:45 | Keynote Session I: Operationalizing Trustworthy Large-Scale AI – Moderator: Ying (Jenny) Zou. Lionel C. Briand (University of Ottawa, University of Luxembourg): Quality Assurance of AI-enabled Systems; Miryung Kim (University of California, Los Angeles): SE4AI – Lessons Learned from Designing SE Methods for Big Data and HW Heterogeneity (Slides); Walid Maalej (University of Hamburg): Tailoring Requirements Engineering for Responsible AI |
| 10:45-11:00 | Coffee break |
| 11:00-12:00 | Research Panel Discussion: Operationalizing Trustworthy Large-Scale AI – Moderator: Ying (Jenny) Zou. Panelists: Lionel Briand (University of Ottawa, University of Luxembourg); Miryung Kim (University of California, Los Angeles); Walid Maalej (University of Hamburg) |
| 12:00-13:30 | Lunch break |
| 13:30-14:00 | Raymond Li (ServiceNow Research): BigCode: Open and Responsible Development of Large Language Models for Code (Slides) |
| 14:00-15:30 | Ahmed Haj Yahmed and Rached Bouchoucha (Polytechnique Montréal): Tutorial: Debugging Deep Reinforcement Learning |
| 15:30-16:00 | Coffee break |
| 16:00-17:30 | Session: Trustworthy Large Language Models and ML Documentation – Moderator: Mohammad Hamdaqa. Jin Guo (McGill University): Machine Learning Documentation – How Far Away Are We; Su Lin Blodgett (Microsoft Research): Examining How We Examine Language Technologies; Sarath Chandar (Polytechnique Montréal, MILA): Towards Interpretable and Bias-free Large Language Models |
| 18:05-20:00 | Reception & Poster session — Poster chair: Mohammad Hamdaqa |
June 10, 2023
| 8:30 – 9:00 | Breakfast |
| 9:00-10:30 | Keynote Session II: Industry View on Trustworthy Large-Scale AI – Moderator: Foutse Khomh. Sumit Gulwani (Microsoft Research): Leveraging LLMs as Analogical Reasoning Engines to Enhance Programming-by-Example Experiences (Slides); Maryam Ahmadi (Canadian National Railway, BrainStation): Root Cause Analysis of System Event Logs; Ahmed E. Hassan (Queen’s University): Foundation Models and Software Engineering in the Beyond Moore Computing Era |
| 10:30-11:00 | Coffee break |
| 11:00-12:00 | Industry Panel Discussion: Industry View on Trustworthy Large-Scale AI – Moderator: Maxime Lamothe. Panelists: Ahmed E. Hassan (Queen’s University); Sumit Gulwani (Microsoft Research); Thomas Reid (Sycodal); Patrick Mesana (HEC Montréal, National Bank of Canada) |
| 12:00-13:30 | Lunch break |
| 12:15-13:15 | SEMLA Members Closed Meeting |
| 13:30-15:00 | Special Panel: Ethical and Legal Implications of LLMs for Code – Moderator: Bram Adams. Panelists: Sumit Gulwani (Microsoft Research); Foutse Khomh (Polytechnique Montréal); Valentin Callipel (Laboratoire de cyberjustice, Université de Montréal); Joé T. Martineau (HEC Montréal) |
| 15:00-15:30 | Coffee break |
| 15:50-17:00 | Session: Software Engineering and AI – Moderator: Heng Li. Ettore Merlo (Polytechnique Montréal): Out-of-distribution Analysis and Robustness of Deep Neural Networks (Slides); Yuan Tian (Queen’s University): Optimizing Software Project Management with AI-powered Tracking Tools; Fatemeh Hendijani Fard (University of British Columbia): Exploiting The Learned Knowledge of Language Models Using Adapters (Slides) |
| 17:00-17:30 | Closing and Best Poster Awards |

Abstract: Anomaly detection plays an important role in the management of modern large-scale distributed systems. Logs, which record system runtime information, are widely used for this purpose. However, unsupervised anomaly detection algorithms face challenges in complex systems, which generate vast amounts of multivariate time series data and where timely detection is essential for minimizing downtime and supporting incident management. To address these challenges, a method called Multi-Scale Convolutional Recurrent Encoder-Decoder (MSCRED) has been developed for detecting anomalies in CN PTC system logs. MSCRED performs anomaly detection and diagnosis on multivariate time series data by constructing multi-scale signature matrices that capture different levels of system status across time steps. It uses a convolutional encoder to capture inter-sensor correlations and a Convolutional Long Short-Term Memory (ConvLSTM) network with attention mechanisms to capture temporal patterns.
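The signature matrices described above can be sketched in a few lines of plain Python. This is an illustrative implementation only, not the talk's or CN's code; the function names (`signature_matrix`, `multi_scale_signatures`) and the example window lengths are assumptions.

```python
# Sketch: MSCRED-style signature matrices from multivariate time series.
# At time step t, entry (i, j) is the windowed inner product of series i
# and series j, summarizing pairwise correlation over the last w steps.

def signature_matrix(series, t, w):
    """series: list of n equal-length lists (one per sensor).
    Returns an n x n correlation matrix for the window ending at t."""
    n = len(series)
    window = [s[t - w + 1 : t + 1] for s in series]
    return [
        [sum(a * b for a, b in zip(window[i], window[j])) / w for j in range(n)]
        for i in range(n)
    ]

def multi_scale_signatures(series, t, scales=(10, 30, 60)):
    """One signature matrix per window length: the multi-scale input
    that the convolutional encoder would consume."""
    return [signature_matrix(series, t, w) for w in scales]
```

Each matrix is symmetric by construction; stacking one matrix per scale gives the image-like tensor on which the convolutional encoder and ConvLSTM operate.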
Abstract: Language models such as RoBERTa, CodeBERT, and GraphCodeBERT have received much attention in the past three years for various Software Engineering tasks. Though these models achieve state-of-the-art performance on many SE tasks, such as code summarization, they often need to be fully fine-tuned for the downstream task. Is there a better way to fine-tune these models that requires training fewer parameters? Can we impose new information on current models without pre-training them again? How do these models perform for different programming languages, especially low-resource ones with less training data available? How can we use the knowledge learned from other programming languages to improve performance on low-resource languages? This talk will review a series of experiments and our contributions toward answering these questions.
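As background for the parameter-efficient fine-tuning the talk explores, here is a minimal sketch of a bottleneck adapter layer in plain Python. It is illustrative only, not code from the talk; the function names and weight shapes are assumptions.

```python
# Sketch: a bottleneck adapter inserted after a transformer sublayer.
# Only the small down- and up-projection matrices are trained; the
# pretrained model's weights stay frozen.

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def relu(v):
    return [max(0.0, x) for x in v]

def adapter(hidden, w_down, w_up):
    """hidden: length-d vector; w_down: r x d; w_up: d x r, with r << d.
    Returns hidden + up(relu(down(hidden)))  (residual connection)."""
    bottleneck = relu(matvec(w_down, hidden))
    projected = matvec(w_up, bottleneck)
    return [h + p for h, p in zip(hidden, projected)]
```

Initializing `w_up` near zero makes the adapter start as an identity map, so inserting it does not disturb the pretrained model before training begins.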