Seminars

Upcoming seminars

The new 2022 MINERVA seminar cycle will start in January.
Program, abstracts and videos will be available below, along with all the previous seminars.


Planning

Date | Speaker | Title
TBD | Stéphane Ilic | Machine learning-infused cluster cosmology
TBD | Mario D’Amore (TBC) | Unsupervised classification of Mercury’s visible–near-infrared MASCS/MESSENGER reflectance spectra for automated surface mapping
TBD | Hannah T. Rüdisser (TBC) | Automatic detection of interplanetary coronal mass ejections in in-situ solar wind data

TBD – Stéphane Ilic
Title: Machine learning-infused cluster cosmology

Watch it live on YouTube: TBD

Abstract:
Clusters of galaxies are powerful probes for testing our cosmological paradigm, especially dark energy models. However, a few persistent obstacles prevent us from using clusters to their full potential, such as a robust way of identifying them in galaxy surveys, or an accurate estimation of their total mass. Given their meteoric rise in astronomy over the past decade, one may wonder whether modern machine-learning algorithms could be used to help with and improve upon each step of the cosmological exploitation of clusters of galaxies, from their detection to the determination of their characteristics.
In this work, we explored how the third iteration of the detection-oriented neural network "You Only Look Once" (YOLOv3, Redmon & Farhadi 2018) could be tweaked and used to detect clusters of galaxies inside images of galaxy surveys. To train our network, we use images from the Sloan Digital Sky Survey (SDSS) and galaxy clusters previously identified by the redMaPPer algorithm (Rykoff et al. 2014). Our promising results pave the way for the use of machine learning in cluster cosmology with the upcoming generation of large-scale galaxy surveys (LSST, Euclid).
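Training a detector such as YOLOv3 on survey images requires converting catalogued cluster positions into the normalized bounding-box labels the network expects. A minimal sketch of that conversion step (the flat-sky mapping, pixel scale, and fixed box size below are illustrative assumptions, not details from the talk):

```python
import numpy as np

def catalog_to_yolo_labels(ra_dec, image_center, pix_scale, img_size, box_pix=64):
    """Turn catalogued sky positions (deg) into YOLO-format labels.

    YOLO expects one row per object: (class, x_center, y_center, width,
    height), all normalized to [0, 1] relative to the image size.  A
    flat-sky linear mapping and a fixed box size are assumed here purely
    for illustration.
    """
    ra0, dec0 = image_center
    labels = []
    for ra, dec in ra_dec:
        # pixel offsets from the image centre (flat-sky approximation)
        dx = (ra - ra0) * np.cos(np.radians(dec0)) / pix_scale
        dy = (dec - dec0) / pix_scale
        x, y = 0.5 + dx / img_size, 0.5 + dy / img_size
        if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:  # keep objects inside the cutout
            labels.append((0, x, y, box_pix / img_size, box_pix / img_size))
    return labels

# a cluster at the cutout centre maps to the centre of the unit square
labels = catalog_to_yolo_labels([(180.0, 30.0)], (180.0, 30.0),
                                pix_scale=1e-4, img_size=512)
```

In a real pipeline the linear mapping would be replaced by the survey's WCS solution, and the box size would be tied to the cluster's angular extent.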

TBD – Mario D’Amore (TBC)
Title: Unsupervised classification of Mercury’s visible–near-infrared MASCS/MESSENGER reflectance spectra for automated surface mapping

Watch it live on YouTube: TBD

Abstract:
The surface of Mercury has been mapped in the 400–1145 nm wavelength
range by the Mercury Atmospheric and Surface Composition Spectrometer
(MASCS) instrument during orbital observations by the MESSENGER
spacecraft. Under the hypothesis that surface compositional information
can be efficiently derived from spectral reflectance measurements with
the use of machine learning techniques, we have conducted unsupervised
hierarchical clustering analyses to identify and characterize spectral
units from MASCS observations. We apply our analysis to the latest
MESSENGER data delivery to the PDS, including the new spectral photometric
correction, finding results consistent with our previous analysis based
on our custom photometric-effect removal.
This work is published in the book Machine Learning for Planetary
Science (Elsevier, 2022; Editors: J. Helbert, M. D’Amore, M. Aye, H.
Kerner) and funded by the European Union’s Horizon 2020 grant (No
871149) in the Europlanet 2024 RI project.
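As a toy illustration of the kind of unsupervised pipeline described above, hierarchical (agglomerative) clustering can separate reflectance spectra with different spectral slopes. The "flat" and "reddened" synthetic spectra below are invented stand-ins for MASCS data, not the actual dataset:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
wav = np.linspace(400, 1145, 64)          # MASCS wavelength range, nm
# two synthetic spectral units: a flat one and a spectrally "red" (sloped) one
flat = 0.050 + 1e-5 * (wav - 400)
red = 0.040 + 8e-5 * (wav - 400)
spectra = np.vstack([flat + 0.002 * rng.standard_normal((20, wav.size)),
                     red + 0.002 * rng.standard_normal((20, wav.size))])

Z = linkage(spectra, method="ward")              # build the clustering tree
units = fcluster(Z, t=2, criterion="maxclust")   # cut it into 2 spectral units
```

Cutting the tree at different depths yields coarser or finer spectral units, which is what makes the hierarchical approach convenient for surface mapping.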

TBD – Hannah T. Rüdisser (TBC)
Title: Automatic detection of interplanetary coronal mass ejections in in-situ solar wind data

Watch it live on YouTube: TBD

Abstract:
Interplanetary coronal mass ejections (ICMEs) are one of the main drivers for space weather
disturbances. In the past, different approaches have been used to automatically detect
events in existing time series resulting from solar wind in situ observations. However,
accurate and fast detection still remains a challenge when facing the large amount of data
from different instruments. Machine learning has been successfully applied in various fields.
We present an approach based on deep learning to automatically detect ICMEs in solar wind
data and discuss the results. Furthermore, we give an outlook on the application of machine
learning in space weather forecasting.
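One common way to set up such a detector is to turn a catalogue of ICME start/end times into per-sample labels for a time-series segmentation network. A minimal sketch (the labeling convention is an assumption for illustration, not taken from the talk):

```python
import numpy as np

def make_segmentation_labels(times, events):
    """Per-sample binary labels: 1 where the timestamp falls inside any
    catalogued ICME interval, 0 in the ambient solar wind."""
    labels = np.zeros(len(times), dtype=int)
    for start, end in events:
        labels[(times >= start) & (times <= end)] = 1
    return labels

# hourly timestamps over 10 days, with one catalogued event spanning hours 72-96
t = np.arange(0.0, 240.0, 1.0)
y = make_segmentation_labels(t, [(72.0, 96.0)])
```

A network trained on such labels predicts an ICME probability per time step, which is then thresholded and post-processed into candidate event intervals.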


Previous seminars (click titles to expand)

2022 cycle

Tues. 24th, May 2022 – Frédéric Courbin :
Finding strong lenses with ML in Euclid and LSST

Did you miss it? You can watch it again on YouTube:


Abstract:

Strong gravitational lensing is the only way to weigh galaxies and their components over a broad mass range. After reviewing some of the most exciting astrophysical and cosmological applications of strong lensing, I will show how galaxy-scale strong lenses can be found in surveys such as Euclid and Rubin-LSST with deep learning. I will use DES and UNIONS as a test bench for such searches and show that machine learning, while extremely effective at finding lenses, still has limitations that need to be overcome.

Tues. 17th, May 2022 – Gilles Louppe :
Simulation-based inference: proceed with caution!

Did you miss it? You can watch it again on YouTube:


Abstract:

Simulation-based inference (SBI) enables approximate Bayesian inference when high-fidelity computer simulators are used to describe phenomena of interest. SBI algorithms have quickly developed and matured over the past years and are already in use across many domains of science such as particle physics, astronomy, cosmology, neuroscience or epidemiology. Inference, however, remains only approximate, and any downstream analysis depends on the trustworthiness of the approximate posterior. In this talk, we will review the Bayesian inference methodology in the likelihood-free setting and discuss good practices to follow, including the choice of the prior, the choice of the SBI algorithm, and diagnostics that we can use to validate inference results.
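The simplest SBI algorithm, rejection ABC, already shows the ingredients discussed here: a prior, a simulator, and an approximate posterior whose quality must be validated. A toy Gaussian-mean example (all numbers invented for illustration, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulator(theta, n=100):
    """Toy simulator: n noisy measurements centred on the parameter theta."""
    return rng.normal(theta, 1.0, size=n)

x_obs = simulator(2.0)          # "observed" data, true theta = 2
s_obs = x_obs.mean()            # summary statistic of the observation

# rejection ABC: simulate from the prior, keep draws whose summary matches
theta_prior = rng.uniform(-5.0, 5.0, size=20_000)
s_sim = np.array([simulator(t).mean() for t in theta_prior])
posterior = theta_prior[np.abs(s_sim - s_obs) < 0.1]
```

The choices flagged in the abstract appear explicitly here: the prior range, the summary statistic, and the tolerance all shape the approximate posterior, which is why downstream diagnostics matter.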

Tues. 5th, April 2022 – Fiona Porter :
Quantifying uncertainty in deep learning classification of radio galaxies

Did you miss it? You can watch it again on YouTube:


Abstract:

The volume of data produced by upcoming next-generation radio surveys such as the SKA will be large enough that machine learning is the only practical way to identify and classify new sources from their observations. While methods such as convolutional neural networks (CNNs) are well-established as being able to perform image classification for astronomy, the problem remains that it is possible for a CNN to make predictions which are both very confident and completely wrong. Robust uncertainty measures are therefore necessary to ensure that our models can both correctly identify the sources they are intended to find and reliably flag out-of-distribution sources. In this talk, I’ll discuss the different types of uncertainty associated with machine learning models, how they relate to uncertainty from a human perspective, and how they can be calculated. Using the example of a dataset of Fanaroff-Riley galaxies, I’ll also show the merits of using different uncertainty metrics to attempt to flag potentially misclassified or out-of-distribution sources.
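For a classifier, the uncertainty types discussed above can be computed from repeated stochastic forward passes (e.g. Monte Carlo dropout): the predictive entropy measures total uncertainty, while the mutual information isolates the model (epistemic) part. A numpy sketch with hand-made probabilities standing in for network outputs:

```python
import numpy as np

def predictive_entropy(probs):
    """Total uncertainty.  probs: (n_passes, n_classes) softmax outputs
    from stochastic forward passes over a single input."""
    mean = probs.mean(axis=0)
    return -np.sum(mean * np.log(mean + 1e-12))

def mutual_information(probs):
    """Epistemic uncertainty: predictive entropy minus the average entropy
    of the individual passes (large when the passes disagree)."""
    per_pass = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return predictive_entropy(probs) - per_pass.mean()

confident = np.tile([0.99, 0.01], (10, 1))                    # passes agree
conflicted = np.tile([[0.99, 0.01], [0.01, 0.99]], (5, 1))    # passes disagree
```

A source whose passes disagree (high mutual information) is exactly the kind of potentially misclassified or out-of-distribution object one would flag for human inspection.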

Tues. 22nd, March 2022 – Alan Heavens :
Field-level inference of cosmic shear: using ‘all’ the data with a Bayesian Hierarchical Model

Did you miss it? You can watch it again on YouTube:


Abstract:

As cosmological survey data get more plentiful, error bars shrink, and we have to worry about ever more subtle effects, and even question whether the traditional methods of data analysis are fit for purpose. As soon as one questions the latter (as a Bayesian), one is taken on a logical path that results in a principled Bayesian solution – a Bayesian Hierarchical Model, where one infers not only the cosmological parameters of interest, but also the entire field of density fluctuations at early times. This is of course challenging, requiring the inference of millions of parameters (or more). In this talk I’ll show two approaches to this problem: Almanac, which characterises the statistics of fluctuations in a cosmology-independent way, and BORG-WL, a powerful extension of the BORG hierarchical framework, developed for weak lensing, which samples cosmological parameters as well as the field. As an example of the power of the approach, we have shown that BORG-WL can reduce the error on the matter density by a factor of 5 when compared with traditional two-point summary statistics, when based on the same underlying data.

Tues. 15th, March 2022 – Ingo Waldmann :
Deep learning in exoplanet characterisation

Did you miss it? You can watch it again on YouTube:


Abstract:

The use of machine and deep learning is prevalent in many fields of science and industry and is now becoming more widespread in extrasolar planet and solar system sciences. Deep learning holds many potential advantages when it comes to modelling highly non-linear data, as well as speed improvements when compared to traditional analysis and modelling techniques. However, their often ‘black box’ nature and unintuitive decision processes, are a key hurdle to their broader adoption. In this seminar, I will give an overview of deep learning approaches used in exoplanet characterisation and discuss our recent work on developing Explainable AI (XAI) approaches. XAI is a rapidly developing field in machine learning and aims to make ‘black box’ models interpretable. By understanding how different neural net architectures learn to interpret atmospheric spectra, we can derive more robust prediction uncertainties as well as map information content as function of wavelength. As data and model complexities are bound to increase dramatically with the advent of JWST and ELT measurements, robust and interpretable deep learning models will become valuable tools in our data analysis repertoire.

Tues. 8th, March 2022 – T. Lucas Makinen :
The Essence of the Cosmos: extracting the information content of cosmological fields for compression and field-level simulation-based inference

Did you miss it? You can watch it again on YouTube:


Abstract:

How much cosmological information is embedded in large-scale structure, and can we extract it? Modern cosmological surveys aim to capture rich images or "fields" of evolving cosmic structure but are often too massive to be interrogated pixel-by-pixel at the field level. We demonstrate that simulation-based compression and inference can be equivalent to all-pixel field likelihoods. We compare simulation-based inference with maximally informative summary statistics compressed via Information Maximising Neural Networks (IMNNs) to exact field likelihoods. We find that a) summaries obtained from convolutional neural network compression do not lose information and therefore saturate the known field information content, b) simulation-based inference using these maximally informative nonlinear summaries recovers nearly losslessly the exact posteriors of field-level inference, bypassing the need to determine or invert covariance matrices, or to assume Gaussian summary statistics, and c) even for this simple example, implicit, simulation-based likelihood incurs a much smaller computational cost than inference with an explicit likelihood. This work uses a new IMNN implementation in Jax that can take advantage of a fully-differentiable simulation and inference pipeline. We further highlight extensions of this pipeline to cases where the cosmological field information is not known a priori, such as in full N-body gravitational and hydrodynamical simulations.
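The linear limit of this kind of compression is classical score (MOPED-like) compression, t = dmu^T C^-1 (x - mu), which already saturates the Fisher information for Gaussian data; the IMNN generalizes it with a neural network. A hedged numpy sketch on an invented toy model (not the cosmological fields of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n_d, theta_fid, dtheta = 100, 1.0, 0.5

def simulate(theta, n_sims):
    """Toy data model: n_d Gaussian draws with mean theta (illustration only)."""
    return rng.normal(theta, 1.0, size=(n_sims, n_d))

sims = simulate(theta_fid, 2000)
mu = sims.mean(axis=0)
C = np.cov(sims, rowvar=False)
# finite-difference derivative of the mean data vector w.r.t. the parameter
dmu = (simulate(theta_fid + dtheta, 2000).mean(0)
       - simulate(theta_fid - dtheta, 2000).mean(0)) / (2 * dtheta)

weight = np.linalg.solve(C, dmu)   # score-compression weights C^-1 dmu

def compress(x):
    """Map an n_d-dimensional data vector to a single informative summary."""
    return weight @ (x - mu)

fisher = weight @ dmu              # Fisher information retained by the summary
```

For this toy model the retained Fisher information is close to the full-data value n_d, which is the sense in which the compression is (nearly) lossless.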

Tues. 1st, March 2022 – Michelle Lochner :
Anomaly Detection in Astronomical Data using Machine Learning

Did you miss it? You can watch it again on YouTube:


Abstract:

The next generation of telescopes such as the SKA and the Vera C. Rubin Observatory will produce enormous data sets, far too large for traditional analysis techniques. Machine learning has proven invaluable in handling large data volumes and automating many tasks traditionally done by human scientists. In this talk, I will discuss how machine learning for anomaly detection can help automate the process of locating unusual astronomical objects in large datasets thus enabling new cosmic discoveries. I will introduce Astronomaly, a general purpose framework for anomaly detection in astronomical data using active learning.
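A minimal distance-based anomaly score of the kind such frameworks can wrap (this is not the algorithm Astronomaly itself uses, and the two-feature toy data are invented):

```python
import numpy as np

def anomaly_scores(X, k=5):
    """Score each object by the distance to its k-th nearest neighbour:
    points far from everything else get large scores."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)          # column 0 is each point's zero self-distance
    return d[:, k]

rng = np.random.default_rng(2)
features = rng.normal(0.0, 1.0, size=(200, 2))   # 200 "ordinary" objects
features[0] = [8.0, 8.0]                         # one unusual object
scores = anomaly_scores(features)
```

In an active-learning loop, the highest-scoring objects are shown to a human, and their feedback is used to re-rank the remaining candidates.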

Tues. 22nd, February 2022 – Niall Jeffrey :
Single frequency CMB B-mode inference with realistic foregrounds from a single training image

Did you miss it? You can watch it again on YouTube:


Abstract:

With a single training image and using wavelet phase harmonic augmentation, we present polarized Cosmic Microwave Background (CMB) foreground marginalization in a high-dimensional likelihood-free (Bayesian) framework. We demonstrate robust foreground removal using only a single frequency of simulated data for a BICEP-like sky patch. Using Moment Networks we estimate the pixel-level posterior probability for the underlying {E, B} signal and validate the statistical model with a quantile-type test using the estimated marginal posterior moments. This work validates such an approach in the most difficult limiting case: pixel-level, noise-free, highly non-Gaussian dust foregrounds with a single training image at a single frequency. These results enable robust likelihood-free, simulation-based parameter and model inference for primordial B-mode detection using observed CMB polarization data.

Tues. 15th, February 2022 – Anaïs Berthereau :
Using deep learning to detect radio frequency interferences in pulsar observations

Did you miss it? You can watch it again on YouTube:


Abstract:

Pulsars are fast-rotating and highly-magnetized stars producing beams of radio emission above their magnetic poles. Because the magnetic axis is not aligned with the spin axis, the beams are swept across the sky during the rotation of the star, thus creating periodic pulses for remote observers. To study pulsars, we record Times of Arrival (TOAs) of periodic radio pulses at a telescope and use the TOAs to estimate the parameters of the pulsar and its environment with high precision. However, radio observations of pulsars are affected by various perturbing signals called radio frequency interferences (RFIs). The method we will present performs RFI excision from pulsar observations using a deep learning algorithm. We compare our results to those of the standard statistical method CoastGuard and assess the impact of the new RFI excision algorithm on TOAs.

Tues. 8th, February 2022 – Benjamin Wandelt :
Machine Learning for Cosmology and Beyond

Did you miss it? You can watch it again on YouTube:


Abstract:

After more than a decade of eager anticipation, data from several major cosmological surveys will begin to arrive during the next three years. These surveys will produce a volume of data that will be orders of magnitude larger than all cosmological data ever collected before. These data will contain information that has the potential to revolutionize our understanding of questions about the nature of dark energy, dark matter, and the initial conditions of the universe, with potentially wide-ranging implications for fundamental physics. Unfortunately, we only know how to access a small fraction of this information. Even now, with already available data sets, we are not limited by the statistical power of the data but by the shortcomings of traditional data analysis approaches. Together with our international partners we have developed methods to unlock vastly more information from cosmological data sets. In addition to working with an explicit likelihood, these methods allow defining the likelihood implicitly through simulations, an approach sometimes called likelihood-free or simulation-based inference. I will discuss the ingredients that make this work: a combination of novel deep learning methods (such as information maximizing neural networks, moment networks, neural physical engines), sophisticated physical modeling, simulations, and statistical approaches (e.g. for constructing optimal estimators, massive data compression, and variance reduction). In addition, we have generated large corpora of state-of-the-art cosmological simulations of structure formation in the universe that we can use to train machine learning algorithms to extract the useful information. Current work focuses on advancing the state of the art in cosmological data analysis by building on these innovative methods. We need to explore high-dimensional parameter space with only a restricted number of costly simulations of the formation of cosmic structures and galaxy formation to guide us.
We are working to quantify for the first time the full cosmological information content of the dark matter and astronomical tracers. To do so we need to address the challenges that arise when analyzing data sets where the information is encoded in non-linear ways and there are multiple confounding factors in real data. My talk will present an overview with multiple examples. In the coming weeks, Minerva talks by my collaborators Niall Jeffrey and Lucas Makinen will explore some of the most powerful techniques in more depth.

Tues. 1st, February 2022 – Francisco Villaescusa-Navarro :
The CAMELS project

Did you miss it? You can watch it again on YouTube:


Abstract:

In this seminar I will present the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, whose main goal is to provide theoretical predictions for observables as a function of cosmology and astrophysics by combining thousands of state-of-the-art cosmological hydrodynamic simulations with machine learning. I will first introduce the simulations and their characteristics. Next, I will present a few results from the CAMELS collaboration, such as the finding of a universal relation between subhalo properties, the use of convolutional neural networks to marginalize over astrophysical effects at the field level, a constraint on the masses of the Milky Way and Andromeda using graph neural networks, and the prospects of inferring cosmological parameters from a single galaxy.

Tues. 25th, January 2022 – Torsten Enßlin :
Imaging with Information Field Theory

Did you miss it? You can watch it again on YouTube:


Abstract:

Turning the raw data of an instrument into high-fidelity pictures of the Universe is a central theme in astronomy. Information field theory (IFT) describes probabilistic image reconstruction from incomplete and noisy data, exploiting all available information. Astronomical applications of IFT include galactic tomography, gamma- and radio-astronomical imaging, and the analysis of cosmic microwave background data. This talk introduces the basic ideas of IFT, highlights its astronomical applications, and explains its relation to contemporary artificial intelligence.

Link to the NIFTy code.

Tues. 18th, January 2022 – Sara Webb :
Fast Flares in the Galaxy, and the unsupervised machine learning used to find them

Did you miss it? You can watch it again on YouTube:



Tues. 11th, January 2022 – Jérôme Bobin :
Non-linear interpolation learning for example-based inverse problem regularization, with applications in physics and astrophysics

Did you miss it? You can watch it again on YouTube:


Abstract:

A large number of signal recovery problems are not well-posed, if not ill-posed (such as unsupervised component separation), and require extra regularization to be tackled. In this context, the ability to inject physical knowledge is of utmost importance in designing effective regularization schemes. However, most physically relevant models are generally non-linear: signals generally lie on an unknown low-dimensional manifold structure, which needs to be learnt. This is quite challenging when available training samples are scarce. For that purpose, we introduce a novel approach that builds upon learning a non-linear interpolatory scheme from examples. We show how it allows building efficient non-linear regularizations to tackle linear inverse problems. This will be illustrated with two applications: semi-blind spectral unmixing in gamma-ray spectroscopy and semi-blind component separation from X-ray multispectral images from the Chandra telescope.

2020–2021 cycle

Tues. 15th, June 2021 – Laure Ciesla :
Recovering galaxies’ star formation history using machine learning

Did you miss it? You can watch it again on YouTube:


Abstract:

Although it is now accepted that galaxies spend the majority of their lives on the so-called main sequence of star-forming galaxies, they are expected to undergo variations in their star formation activity. These variations, on a timescale of 0.1–1 Gyr, are difficult to constrain from broad-band SED fitting due to degeneracies, and spectroscopic information is required to accurately recover their history. However, these spectroscopic data are not always available (for instance, for only 1% of the galaxies of the whole COSMOS field). To extract the maximum information from broad-band photometry and recover the last few hundred Myr of galaxies’ history, we developed a method based on Approximate Bayesian Computation combined with machine learning techniques. I will show how this technique can be used to recover past properties of galaxies, and also how it can be applied to different problems. Furthermore, drawing from the literature, I will present other approaches to deriving the star formation history of galaxies using different machine learning methods.

Tues. 8th, June 2021 – Joana Frontera :
Robust anomaly detection for hyperspectral imaging

Did you miss it? You can watch it again on YouTube:


Abstract:

Anomaly detectors aim at finding any pixel that is different from its surrounding normal background pixels. Most of the classical anomaly detection algorithms are based on the Mahalanobis distance, and therefore, they are mainly sensitive to the signal energy. One could project the hyperspectral datacube onto the unit hypersphere in order to enhance detection of faint targets. In this context, we introduce the class of Angular Gaussian distributions for hyperspectral data modelling. Moreover, the corresponding maximum likelihood estimates and the generalized likelihood ratio test are both presented. The resulting anomaly detection scheme is independent of the true distribution of the data within the family of elliptical distributions.
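The two ideas in this abstract, a Mahalanobis-distance score and a unit-hypersphere projection that discards signal energy, can be sketched in a few lines. The 5-band toy datacube below is invented for illustration:

```python
import numpy as np

def mahalanobis_scores(X):
    """Classical anomaly score: squared Mahalanobis distance of each
    spectrum to the background mean under the sample covariance."""
    mu = X.mean(axis=0)
    diff = X - mu
    Cinv = np.linalg.inv(np.cov(X, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, Cinv, diff)

def sphere_project(X):
    """Project each spectrum onto the unit hypersphere, discarding its
    energy so that only its spectral shape matters."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

rng = np.random.default_rng(5)
base = np.ones(5) / np.sqrt(5.0)                  # background spectral shape
bright = rng.uniform(0.5, 2.0, size=(200, 1))     # varying pixel brightness
X = bright * base + 0.01 * rng.standard_normal((200, 5))
ramp = np.linspace(1.0, 2.0, 5)
X[0] = 0.1 * ramp / np.linalg.norm(ramp)          # faint target, different shape

scores = mahalanobis_scores(sphere_project(X))    # shape-based detection
```

After projection the faint target stands out purely by its spectral shape, despite having less energy than any background pixel.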

Tues. 18th, May 2021 – Colin Burke :
A deep learning approach to galaxy deblending, classification, and detection

Did you miss it? You can watch it again on YouTube:


Abstract:

Future deep and wide-area imaging surveys, such as LSST, require efficient and robust techniques for image processing. Beyond object detection and classification, the task of source separation (deblending) will be particularly challenging in this era. In LSST, the fraction of galaxy superpositions (blends) is expected to be up to 50%. In the field of computer vision, deblending can be thought of as the task of “amodal instance segmentation”. In this talk, I will discuss our application of a recently-developed deep learning approach called Mask R-CNN to this problem. Mask R-CNN is a general method for instance segmentation popular in the computer vision community. I will discuss the successes and challenges of applying these techniques to astronomical images. I will end with a brief outlook to the near-future of computer vision techniques in astronomical imaging.

Tues. 11th, May 2021 – Shirley Ho :
Extracting physical laws with deep learning

Did you miss it? You can watch it again on YouTube:


Abstract:

I will discuss our new approach to extract symbolic representations of a learned deep model by introducing strong inductive biases. I will demonstrate our method with both simulated and real datasets.

We develop a general approach to distill symbolic representations of a learned deep model by introducing strong inductive biases. We focus on Graph Neural Networks (GNNs). The technique works as follows: we first encourage sparse latent representations when we train a GNN in a supervised setting, then we apply symbolic regression to components of the learned model to extract explicit physical relations. We find the correct known equations, including force laws and Hamiltonians, can be extracted from the neural network. We then apply our method to a non-trivial cosmology example -a detailed dark matter simulation- and discover a new analytic formula which can predict the concentration of dark matter from the mass distribution of nearby cosmic structures. The symbolic expressions extracted from the GNN using our technique also generalized to out-of-distribution data better than the GNN itself. Our approach offers alternative directions for interpreting neural networks and discovering novel physical principles from the representations they learn.

Materials can be found at: https://astroautomata.com/paper/symbolic-neural-nets/
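In the same spirit, on a much smaller scale: once a model's outputs track a physical quantity, a symbolic form can sometimes be read off with ordinary regression. The toy below uses a log-log linear fit (a stand-in for full symbolic regression, with invented data) to recover an inverse-square law:

```python
import numpy as np

rng = np.random.default_rng(3)
r = rng.uniform(1.0, 10.0, size=500)
# noisy samples of F = A / r**2 with A = 4 (a made-up "force law")
F = 4.0 / r**2 * np.exp(rng.normal(0.0, 0.01, size=500))

# a power law is linear in log space, so least squares recovers
# both the exponent and the amplitude of the underlying relation
slope, intercept = np.polyfit(np.log(r), np.log(F), deg=1)
exponent, amplitude = slope, np.exp(intercept)
```

Genuine symbolic regression searches over a space of expressions rather than assuming a power-law form, but the principle of fitting interpretable formulas to learned representations is the same.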

Tues. 13th, April 2021 – Anais Moller :
Machine learning in time-domain astronomy & supernova cosmology

Did you miss it? You can watch it again on YouTube:


Abstract:

In DES, we are working on a cosmological analysis with the full five years of data using machine learning algorithms for supernova photometric classification. I will highlight key features of SuperNNova, an open source photometric classification framework that leverages the power of Recurrent Neural Networks and Bayesian Neural Networks. Looking towards the future, I will discuss the use of such photometric classifiers for the VRO through the Fink broker. LSST is expected to detect 10,000 transient candidates every 30 seconds; within these alerts there will be a small fraction of promising transients. Fink is designed to use state-of-the-art machine learning methods to allow selecting promising transients for follow-up coordination while enabling diverse time-domain astronomy studies. I will discuss the possibility of using Bayesian Neural Networks to deal with representativeness issues and out-of-distribution events.

Tues. 6th, April 2021 – Maxime Paillassa :
Robust detection of astronomical sources using convolutional neural networks

Did you miss it? You can watch it again on YouTube:


Abstract:

Extracting reliable source catalogs from images is crucial for a broad range of astronomical research topics. However, the efficiency of current source detection methods becomes severely limited in crowded fields, or when images are contaminated by optical, electronic and environmental defects. In this talk I will present new methods to produce more robust and reliable source catalogs using convolutional neural networks (CNNs). First, I present MaxiMask, a CNN trained to automatically identify image defects in astronomical exposures. Then, I introduce a prototype of multi-scale CNN-based source detector robust to image defects, which is shown to outperform current state-of-the-art algorithms.

Tues. 30th, March 2021 – Stephane Aicardi :
Deep Learning on jovian decametric emissions

Did you miss it? You can watch it again on YouTube:


Abstract:

I will present an attempt to detect and localize Jovian decametric emissions observed by the Nançay decameter array. After a brief presentation of the physical phenomenon and some reminders about convolutional neural networks, I will present the training and validation sets, the network used and the achievements so far. I will end with a demo of some visualization tools I use to validate the results.

Tues. 23rd, March 2021 – Anna Bonaldi :
The SKA science data challenges program

Did you miss it? You can watch it again on YouTube:


Abstract:

In this talk I will describe the context and the motivation for running a series of science data challenges to prepare the community for the use of SKA data. I will present results of the first science data challenge (SDC1, Bonaldi et al. 2021) and I will describe the currently ongoing challenge, SDC2. I will finally present some examples of machine learning applications for the science data challenges.

Tues. 16th, March 2021 – Nathanael Perraudin :
DeepSphere: an almost equivariant graph-based spherical CNN

Did you miss it? You can watch it again on YouTube:


Abstract:

Convolutional Neural Networks (CNNs) are a cornerstone of the Deep Learning toolbox and have led to many breakthroughs in Artificial Intelligence. These networks have mostly been developed for regular Euclidean domains such as those supporting images, audio, or video. Because of their success, CNN-based methods are becoming increasingly popular in Cosmology. Cosmological data often comes as spherical maps, which make the use of the traditional CNNs more complicated. The commonly used pixelization scheme for spherical maps is the Hierarchical Equal Area isoLatitude Pixelisation (HEALPix). We present a spherical CNN for analysis of full and partial HEALPix maps, which we call DeepSphere. The spherical CNN is constructed by representing the sphere as a graph. Graphs are versatile data structures that can act as a discrete representation of a continuous manifold. Using the graph-based representation, we define many of the standard CNN operations, such as convolution and pooling. With filters restricted to being radial, our convolutions are equivariant to rotation on the sphere, and DeepSphere can be made invariant or equivariant to rotation. This way, DeepSphere is a special case of a graph CNN, tailored to the HEALPix sampling of the sphere. This approach is computationally more efficient than using spherical harmonics to perform convolutions. We demonstrate the method on a classification problem of weak lensing mass maps from two cosmological models and compare the performance of the CNN with that of two baseline classifiers. The results show that the performance of DeepSphere is always superior or equal to both of these baselines. For high noise levels and for data covering only a smaller fraction of the sphere, DeepSphere achieves typically 10% better classification accuracy than those baselines. Finally, we show how learned filters can be visualized to introspect the neural network.
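The core operation can be sketched in a few lines: represent the domain as a graph, build its Laplacian L, and filter signals with a polynomial in L. On a ring graph (a tiny 1-D stand-in for a HEALPix sphere graph, invented here for illustration) such filters commute with rotations, mirroring the equivariance property described above:

```python
import numpy as np

n = 8                                       # pixels on a ring graph
A = np.zeros((n, n))
for i in range(n):                          # link each pixel to its 2 neighbours
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1.0
L = np.diag(A.sum(axis=1)) - A              # combinatorial graph Laplacian

def graph_conv(x, theta):
    """Polynomial (localized) graph filter: y = sum_k theta[k] * L^k x."""
    y, Lkx = np.zeros_like(x), x.copy()
    for t in theta:
        y += t * Lkx
        Lkx = L @ Lkx
    return y

x = np.arange(n, dtype=float)
theta = [0.5, -0.25, 0.1]
# rotating the input then filtering equals filtering then rotating,
# because L is circulant on the ring and commutes with cyclic shifts
lhs = graph_conv(np.roll(x, 1), theta)
rhs = np.roll(graph_conv(x, theta), 1)
```

Because the filter is a degree-K polynomial in L, it is also K-hop localized, and it avoids the spherical-harmonic transforms that make exact spherical convolutions expensive.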

Tues. 9th, March 2021 – Stéphane Arnouts (LAM) :
The challenges of Photometric redshifts & cosmic web reconstruction with imaging surveys

Did you miss it? You can watch it again on YouTube:


Abstract:

Understanding the origin of the accelerated expansion of the Universe, the formation of large scale structure and of the galaxies embedded in it are the quests of modern cosmology. Upcoming large imaging surveys will contribute significantly to those questions. But to fully exploit the statistical power of the photometric surveys, a key ingredient is the accuracy and reliability of photometric redshift estimates, which complement the prohibitively time-intensive spectroscopic approach. After a brief introduction on the interest of exploring galaxy evolution within the global context of the cosmic web, I will present our investigations of a photometric redshift method based on deep learning (convolutional neural networks). I will present results based on the local SDSS survey and then move to higher redshift, ending with the HSC deep survey. I will describe the challenges we face regarding the relevance of the available training set and potential ways to overcome those limitations in the near future. I will briefly end by propagating the expected photo-z uncertainties into a 2D cosmic web analysis.

Tues. 2nd, March 2021 – Hector Hortua :
Uncertainty Quantification in Deep Learning: Applications in Cosmology

Did you miss it ? You can watch it again on YouTube:


Abstract:

Deep neural networks have been shown to learn effective predictors on a wide range of applications. However, as the standard approach is simply to train the network to optimize a cost function, the resulting model remains ignorant of its own confidence. In this talk, we focus on Bayesian neural networks, along with Gaussian Processes, and study a natural way to quantify uncertainty in Deep Learning. Finally, we will apply the results to some applications in Cosmology: CMB and 21cm maps.
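One common and simple way to make a network report its confidence is Monte Carlo dropout: keep dropout active at inference time and read the spread of many stochastic forward passes as an uncertainty estimate. The sketch below (random toy weights, invented layer sizes) illustrates only that one idea; the talk covers Bayesian neural networks and Gaussian Processes more broadly:

```python
import numpy as np

rng = np.random.default_rng(42)

# Tiny random two-layer network; the weights are arbitrary because the
# point is only to show how stochastic forward passes yield a
# predictive mean and an uncertainty estimate.
W1 = rng.normal(size=(10, 32))
W2 = rng.normal(size=(32, 1))

def forward(x, p_drop=0.5):
    h = np.maximum(x @ W1, 0.0)           # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop   # dropout kept ON at inference
    h = h * mask / (1.0 - p_drop)         # rescale to preserve expectation
    return (h @ W2).ravel()

x = rng.normal(size=(1, 10))
# Monte Carlo dropout: many stochastic passes approximate sampling
# from an (approximate) posterior over networks.
samples = np.array([forward(x) for _ in range(500)])
mean, std = samples.mean(), samples.std()
```

Here `mean` plays the role of the prediction and `std` of its uncertainty; a fully Bayesian network would instead place explicit distributions over the weights.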

Tues. 9th, February 2021 – Antoine Marchal :
ROHSA: Regularized Optimization for Hyper-Spectral Analysis: Application to phase separation of 21 cm data.

Did you miss it ? You can watch it again on YouTube:


Abstract:

Star formation in galaxies is strongly linked to the physical processes that govern the evolution of the interstellar medium. Stars form by gravitational collapse of dense and cold structures in molecular clouds, but the process that leads to the formation of these over-densities is still unclear. One key element seems to be related to the efficiency of the formation of the Cold Neutral Medium (CNM). Several studies have aimed at understanding the production of the CNM through the condensation of the Warm Neutral Medium (WNM) in a turbulent and thermally unstable flow using numerical simulations. In general, these studies indicate the presence of a significant fraction of the mass being in the thermally unstable regime. However, the thermodynamical conditions of the gas remain largely unexplored from the observational point of view. To go further, and really compare with numerical simulations that are, for now, under-constrained by observation, it is mandatory to map the column density structure of each phase and study the spatial variations of their centroid velocity and velocity dispersion. This calls for methods that can extract the information of each HI phase from fully sampled 21 cm emission data only. I will present ROHSA, an original Gaussian decomposition algorithm based on a multi-resolution process from coarse to fine grid, using a regularized non-linear least-squares criterion to take into account simultaneously the spatial coherence of the emission and the multiphase nature of the gas. This method allows us to infer a spatially coherent vision of the three-phase neutral ISM. Using ROHSA, I will present observational results on the thermal and turbulent properties of HI gas in the solar neighborhood and in the high-velocity cloud Complex C.
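The core of a Gaussian decomposition can be illustrated on a single synthetic spectrum. The sketch below fits a sum of Gaussians by non-linear least squares with scipy; ROHSA's distinguishing ingredient, the spatial regularization that couples neighbouring lines of sight on a multi-resolution grid, is deliberately omitted here, and all the numbers are invented:

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic 21 cm-like spectrum: two blended Gaussian components.
v = np.linspace(-50, 50, 200)  # velocity axis (km/s)

def gaussians(p, v):
    """Sum of Gaussians; p = [amp1, mu1, sig1, amp2, mu2, sig2, ...]."""
    model = np.zeros_like(v)
    for a, mu, sig in p.reshape(-1, 3):
        model += a * np.exp(-0.5 * ((v - mu) / sig) ** 2)
    return model

truth = np.array([1.0, -5.0, 4.0, 0.6, 8.0, 12.0])
spectrum = gaussians(truth, v) + 0.02 * np.random.default_rng(1).normal(size=v.size)

# Least-squares fit of the component parameters from a rough guess.
guess = np.array([0.8, 0.0, 5.0, 0.5, 10.0, 10.0])
fit = least_squares(lambda p: gaussians(p, v) - spectrum, guess)
```

In ROHSA this fit is performed jointly for every line of sight of the data cube, with penalty terms enforcing smooth maps of amplitudes, centroid velocities, and dispersions across the sky.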

Tues. 2nd, February 2021 – Rodrigo Ibata :
Learning the Dynamics of the Galaxy

Did you miss it ? You can watch it again on YouTube:


Abstract:

The Galactic halo is criss-crossed by long stellar streams that are probably the remnants of defunct globular clusters and dwarf galaxies. I will briefly present our recent discoveries of these structures with Gaia, before discussing how we are using them to probe the Galactic acceleration field. I will concentrate on a novel unsupervised machine-learning method that we have developed that fits the acceleration field while also learning the transformation from observed kinematic coordinates to action-angle variables. I suspect that this new method may have quite broad application throughout many areas of physics.

Tues. 26th, January 2021 – Aristide Doussot :
Supervised Learning methods for EoR parameter reconstruction

Did you miss it ? You can watch it again on YouTube:


Abstract:

Within the next few years, the Square Kilometer Array (SKA) or one of its pathfinders will hopefully provide a detection of the 21-cm signal fluctuations from the Epoch of Reionization (EoR). Then, the main goal will be to accurately constrain the underlying astrophysical parameters. Currently, this is mainly done with Bayesian inference using Markov Chain Monte Carlo sampling. Recently, studies using neural networks trained to perform inverse modelling have shown interesting results. We build on these by improving the accuracy of the neural network predictions and by exploring other supervised learning methods: kernel and ridge regressions. We also present theoretical considerations about the learning sample and our study of how to optimize the information it contains.
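Ridge regression, one of the supervised methods mentioned above, has a convenient closed form that can be sketched directly. The toy problem below (random features standing in for 21-cm summary statistics, an invented linear relation) illustrates only the technique, not the study's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised problem: map summary statistics of simulated signals
# (here purely random features) to an astrophysical parameter.
X_train = rng.normal(size=(200, 10))   # 200 simulations, 10 features
w_true = rng.normal(size=10)
y_train = X_train @ w_true + 0.1 * rng.normal(size=200)

# Ridge regression in closed form: w = (X^T X + lam I)^{-1} X^T y.
# The penalty lam stabilizes the inversion when features are correlated.
lam = 1.0
w = np.linalg.solve(X_train.T @ X_train + lam * np.eye(10), X_train.T @ y_train)

X_test = rng.normal(size=(50, 10))
y_pred = X_test @ w
```

A kernel regression replaces the linear map with similarities between simulations, which is one way such methods trade flexibility against the size of the learning sample discussed in the abstract.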

Tues. 12th, January 2021 – Marc Huertas-Company :
Is deep learning useful to understand the physics of galaxies?

Did you miss it ? You can watch it again on YouTube:


Abstract:

As available data grow in size and complexity, Deep Learning (DL) has rapidly emerged as an appealing solution to address a variety of astrophysical problems. In my talk, I will review applications of both supervised and self-supervised DL to several galaxy-formation science cases, from basic low-level data processing tasks to more advanced problems involving simulations and observations. I will try to emphasize successes and failures, and discuss promising research lines for the future.

Tues. 24th, November 2020 – Victor Bonjean (SISSA) :
Properties of matter in large scale structures with machine learning

Did you miss it ? You can watch it again on YouTube:


Abstract:

Studying the evolution and the composition of the largest structures of the Universe, e.g. galaxy clusters and cosmic filaments, is one of the most challenging research topics in cosmology, and will become even more challenging with the massive amount of data delivered in the near future by forthcoming experiments (e.g. Vera Rubin, Euclid, SKA, SO). Since the beginning of my PhD, I have analysed publicly available multi-wavelength surveys (namely SDSS, WISE, Planck), using « new » techniques in data analysis such as machine learning or Bayesian methods, in order to study the baryonic matter (hot gas and galaxies) in the very faint environments of the Cosmic Web. I will present some methods, based on machine learning, that aim at understanding cosmic web environments: i) probing the effect of the environment on galaxy evolution by estimating galaxy properties such as star formation rate and stellar mass with Random Forests for millions of sources; ii) detecting the very faint hot gas emission through the Sunyaev-Zel’dovich effect in clusters and superclusters with deep learning; iii) estimating galaxy cluster masses with new approaches; and iv) performing autonomous multi-component separation to properly clean full-sky microwave data of foreground emissions. These concepts and developments are in line with the efforts needed in data analysis in order to be prepared for the future era of big data in astrophysics.

Tues. 10th, November 2020 – Emeric Bron (LERMA) :
Exploring the Orion B molecular cloud with machine learning methods: from data-driven to model-based approaches

Did you miss it ? You can watch it again on YouTube:


Abstract:

Giant Molecular Clouds (GMCs) are the birthplace of stars, and understanding the link between their internal structure (physical, chemical and dynamical) and their star formation efficiency remains a key astrophysical question. The ORION-B (Outstanding Radio-Imaging of OrioN B) IRAM-30m large program (PIs : M. Gerin, J. Pety, Pety et al. 2017) provides an exhaustive map of a GMC (over ~20pc with a 50mpc resolution) in the full 100GHz band allowing observations of ~20 molecular lines and thus an unbiased global survey of a GMC. Fully exploiting the wealth of information contained in this massive dataset (~1 000 000 pixels, 240 000 spectral channels per pixel) brings new challenges, and Machine Learning approaches offer new opportunities to extract hidden patterns from our data.

In this seminar, I will present several approaches that allowed us to explore this high-dimensional dataset. First, unsupervised approaches have allowed us to disentangle the underlying physical parameters controlling the observed molecular emission and the different physico-chemical phases. I will then present our use of a supervised machine learning approach (random forests) to predict the total quantity of matter along the line of sight (usually derived from infrared dust emission and requiring costly space telescopes) based only on ground-based radio observations. Finally, the interpretation of massive observational datasets such as ORION-B requires astrochemical models, linking the observed molecules to the local physical conditions. The massive datasets of model grid results, exploring multi-dimensional parameter spaces and predicting thousands of observable quantities, also remain to be fully exploited. I will present here the statistical approach (based on machine learning tools) that we have applied to tackle a specific problem: the ionization fraction of the gas, which controls key physical and chemical processes in GMCs, is not directly observable. Classical tracers of the ionization fraction (e.g. DCO+/HCO+) are only detectable in the densest cores, precluding a complete view of the ionization fraction across a GMC. We propose a statistical approach, exploiting our large model grids to automatically find the best observable tracers of the ionization fraction among hundreds of species included in the model.
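The last idea, ranking candidate tracers of a quantity that is not directly observable against a grid of models, can be sketched with a mock model grid. Everything below (the species list, the abundances, and the toy relation defining the ionization fraction) is invented for illustration; the actual work uses large astrochemical model grids with hundreds of species:

```python
import numpy as np

rng = np.random.default_rng(3)

# Mock grid of astrochemical models: each row is one model, each
# column the (log) abundance predicted for one chemical species.
n_models, n_species = 500, 6
species = ["HCO+", "DCO+", "CO", "CN", "C2H", "HCN"]
log_abund = rng.normal(size=(n_models, n_species))

# Ionization fraction of each model; by construction, species index 1
# carries most of the information in this toy setup.
x_e = 0.8 * log_abund[:, 1] - 0.3 * log_abund[:, 0] + 0.1 * rng.normal(size=n_models)

# Rank the candidate tracers by the absolute Pearson correlation of
# their abundance with the ionization fraction across the model grid.
corrs = [abs(np.corrcoef(log_abund[:, j], x_e)[0, 1]) for j in range(n_species)]
best = species[int(np.argmax(corrs))]
```

Real searches would consider line ratios and non-linear dependencies as well, but the principle, letting the model grid vote for the most informative observable, is the same.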

Tues. 3rd, November 2020 – David Cornu (LERMA) :
Deep learning for YSO classification and Milky Way extinction map reconstruction using Gaia and infrared surveys

Did you miss it ? You can watch it again on YouTube:


Abstract:

The observation of our home galaxy, the Milky Way (MW), is made difficult by our internal viewpoint, by stellar confusion, and by screening by interstellar matter. The Gaia survey, which contains around 1.6 billion star distances, is the new flagship of MW structure and stellar population studies. Concurrently, the past two decades have seen an explosion in the use of Machine Learning (ML) methods, which are also increasingly employed in astronomy. With these methods it is possible to automate complex problem solving and to efficiently extract statistical information from large datasets.

I will first describe the construction of a ML classifier to improve a widely adopted classification scheme for Young Stellar Object (YSO) candidates. Since YSOs probe dense star-forming regions, they can be combined with Gaia distance measurements to reconstruct the 3D structure of dense clouds. Our ML classifier is based on classical Artificial Neural Networks (ANN) and uses IR data from the Spitzer space telescope to reproduce the YSO classification automatically from given examples.

In a second part, I will present a new method for reconstructing the 3D extinction distribution of the MW based on Convolutional Neural Networks (CNN). The CNN is trained using the Besançon Galaxy Model, and learns to infer the extinction distance distribution by comparing results of the model with observed data. This method is able to resolve distant structures up to 10 kpc with a formal resolution of 100 pc, and was found to be capable of combining 2MASS and Gaia datasets without the need for a cross-match.

2020 warm-up cycle

Wed. 15th, January 2020 – Liam Connor :
Machine learning for transient radio astrophysics
Anton Pannekoek Institute for Astronomy, University of Amsterdam

Did you miss it ? You can watch it again on YouTube:


Abstract:

Modern radio telescopes sample the sky’s electric field at an enormous rate, producing ~terabits per second of data. At Apertif, where we are searching for millisecond-duration fast radio bursts (FRBs), that deluge of data must quickly be reduced to a single bit: true astrophysical FRB, or false positive. To tackle this problem, I have built a lightweight, tree-like neural network that classifies FRB candidates using several different input data products. I will discuss this work, as well as future ideas for applying machine learning to radio astronomy using accelerated hardware.

Wed. 15th, January 2020 – Mauricio Araya :
Structure Characterization and Compact Representations for Astronomical Hyperspectral Images
Ing. Civil Telemática, Departamento de Electrónica, Universidad Técnica Federico Santa María, Chile

Did you miss it ? You can watch it again on YouTube:


Abstract:

In this talk, we will discuss four recent contributions for automatically detecting the emission components of extended sources in interferometric data cubes from the ALMA observatory. Our approaches try to move away from classical pixel-based analysis, allowing us to represent the source components compactly as a sum of continuous functions. The first two techniques adapt classical image processing methods, such as morphological operators and wavelets, to the additional spectral dimension. The objective is to identify multiscale regions of interest in the cubes with hierarchical connections. The other two techniques focus on the problem of working with very large cubes or images. Here we propose to fit sums of simple N-dimensional continuous functions as our source representation through optimization techniques. This not only generates a compact representation, but can also be directly used for analysis through data fusion and unsupervised machine learning approaches, such as moment-preserving merges and hierarchical clustering.
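The compact-representation idea, replacing many pixels by the parameters of a few continuous functions, can be sketched on a toy image. The example below fits a single 2D Gaussian to a synthetic blob with scipy; real sources would need sums of such components over a full data cube, and all the numbers here are invented:

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic "source": one compact 2D Gaussian blob on a small image,
# standing in for an emission component in one channel of a data cube.
ny, nx = 40, 40
yy, xx = np.mgrid[0:ny, 0:nx]

def blob(p):
    """Isotropic 2D Gaussian; p = [amplitude, x0, y0, sigma]."""
    a, x0, y0, sig = p
    return a * np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * sig ** 2))

truth = np.array([2.0, 18.0, 22.0, 3.0])
image = blob(truth) + 0.05 * np.random.default_rng(7).normal(size=(ny, nx))

# Replacing ~1600 pixel values by 4 continuous-function parameters is
# the compact-representation idea; the optimizer refines a rough guess.
guess = np.array([1.0, 20.0, 20.0, 5.0])
fit = least_squares(lambda p: (blob(p) - image).ravel(), guess)
```

Once sources are stored as fitted parameters rather than pixels, operations such as merging nearby components or clustering them hierarchically act on a handful of numbers per component.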