Interview with Jesús Marco de Lucas

Research Professor at the Instituto de Física de Cantabria (IFCA), a joint center of CSIC and the Universidad de Cantabria. Director of IFCA (2004-2007) and Coordinator of the CSIC Physics Area (2008-2010). IFCA has two research lines: Experimental Particle Physics, and Advanced Computing and e-Science. He obtained his PhD in experimental particle physics in 1989 and, until 2000, took part in the search for the Higgs boson in the DELPHI experiment at the Large Electron Positron Collider (LEP) at CERN. As a member of the CMS collaboration at the Large Hadron Collider (LHC), he works on the Tier-2 computing center at IFCA and on Data Preservation and Open Access tasks. A promoter of the European Grid Initiative (EGI), he is now involved in the European Open Science Cloud, as coordinator of the new DEEP Hybrid Data Cloud H2020 project and in the EOSC-Hub stakeholder analysis, alongside his current work on research requirements within the INDIGO project. He has coordinated the installation of computing infrastructure at IFCA, including GRID-CSIC resources and the ALTAMIRA supercomputer. He is also involved in the ESFRI LifeWatch initiative, leading the LifeWatch Competence Center, and works with several companies (CIC, Ecohydros, BSH and ENSA) on research and innovation projects. He coordinates a new master's programme in Data Science (UIMP/University of Cantabria/CSIC).

You recently came to Brussels to attend the meeting of Science Europe's "Research Data Management" working group. Could you tell us about the relevance of this subject for CSIC, as well as where we are and where we are heading?

Research Data Management is a key subject for a science increasingly based on huge quantities of data, from sequencers to satellites. These data must not only be analyzed but also transformed, which produces yet more data, and records must be kept of all of it. Managing the whole life cycle of these data should ensure that final results can be reproduced, yielding a transparent science, and that the data are preserved so other scientists can reuse them. On the other hand, tackling complex problems requires data from different areas, and here CSIC is in a unique position in Spain thanks to its multidisciplinary excellence. In parallel to this working group, CSIC drafted an internal analysis on the subject. We hope the next actions will come from the European initiatives to be launched under H2020 next year, which will make data more "findable, accessible, interoperable, reusable and reproducible".

The so-called European Open Science Cloud is expected to be a reality in 2020. What is your opinion in this respect? Do you think European industry is sufficiently involved?

The European Open Science Cloud originally had three objectives: to promote open access to research results, not only data but also publications; to use cloud technology as the framework to implement this process; and, at the same time, to offer more powerful computing services. A further aim was to get companies involved, both taking part in these services and using them to develop new products.

From my point of view, the initial estimate of the public resources needed to achieve these objectives, and their prioritization, was not adequate. The 250 M€ currently foreseen in H2020 for the first objective may allow building a data cloud that federates existing resources, but that is neither enough to give scientists resources competitive with China or the USA, nor enough to attract industry to this area.

So far we have not been able to identify a company or a European consortium with the capacity to play a role similar to Google's or Amazon's in cloud services.

You have taken part, as partner or coordinator, in different projects since the Fifth Framework Programme. Have you seen a positive evolution of the programme? What do you think should be done or changed that has not been so far?

It is clear to me that we need genuine, effective monitoring of projects and of the stakeholders involved, and above all that this monitoring should be taken into account when deciding whether to give continuity through new projects. I have taken part not only as a partner but also as a coordinator (I will be a coordinator again in the new project), and also as an evaluator and as an advisor to the Ministry in assessing new work-programme proposals.

To be honest, I have not seen a significant positive evolution. I have to say that, in general, the projects and consortia I have come to know have worked very well, and the research lines promoted have an adequate basis and scientific consensus, at least in the area of research infrastructures. But the weight of lobbies and big consortia is very high and hampers renewal. That is why better monitoring is so relevant.

Over your career you have probably come to know the policies and strategies of other European organizations, peers of CSIC, aimed at increasing their participation in European projects or at being better represented in international bodies. Would you highlight any specific policy that might suit our institution?

Yes. In my opinion, one of the best policies is having the chance to explore new initiatives with one's own resources, even if they are limited. How? Through preparatory workshops, initial test beds, small-scale prototypes, and so on. All of these let us start something new with confidence and with the extra edge that initial experience provides, whether it turns out positive or negative. Above all, that previous experience motivates researchers, because it is something they love doing, without the pressure of a formal project and developing their own ideas quickly. CSIC already makes this possible through the management of unspent project funds, but it can sometimes be complicated.

For two years you were the coordinator of the Physics area at CSIC. How was that experience, and what objectives did you want to reach? And right now, what are the biggest challenges for the area?

What I am going to say may be a surprise... but what I really gained was great trust in the scientists working at the CSIC centers in the Physics area, many of whose work I did not even know in detail beforehand. I personally visited all the centers, learned a great deal about their research lines, and my impression was always very positive. At the same time I saw considerable fragmentation: cooperation between centers, apart from notable exceptions, should have been better.

We set very ambitious objectives in a strategic plan running to 2010, even though we knew the crisis was going to affect the implementation of many of the proposed support measures (new positions, equipment, multidisciplinary lines and collaboration between different centers).

In my opinion, the renewal of research topics is the biggest challenge for the Physics area right now, which in turn entails the reorientation of many research groups.

While physics technologies keep advancing exponentially, progress in basic physics is much more costly, especially if we compare it with other areas, such as biology.

To sum up, the technology of the twenty-first century is deeply exploiting the physics of the twentieth (think of quantum computing). I would even dare to say that we need a new revolution in physics and mathematics, with new principles, if physics is to step up again as it did in the last century.

From a scientific point of view, which organizations do you work with, and who are the pioneers?

Since the beginning of my professional career I have been collaborating with different groups at CERN, especially from CNRS (IN2P3-LAL), INFN (Padua) and CERN itself, and with several others, including a CSIC center in Spain: IFIC.

In the computing area I am involved in a wide range of collaborations. In cloud/grid computing I would highlight the cooperation, pioneering at the time, within Ibergrid with LIP in Portugal, with PSNC and CYFRONET in Poland, with KIT in Germany, and currently with INFN (Bologna, Bari) in Italy; and, in Spain, with UPV, BIFI and CESGA, as well as with BSC in supercomputing. Within the ESFRI LifeWatch context I work with many Spanish university groups (Navarra, Sevilla, Granada), with MNCN, RJB and EBD from CSIC, with VLIZ in Belgium and HCMR in Greece, and with NEON in the USA, an international reference in ecosystem monitoring. Working with all of them, we have set up cloud services that will be pioneering for the first ESFRI hosted in Spain: LifeWatch.

As coordinator of the LifeWatch Competence Center, could you explain for a non-expert audience the advantages of this kind of infrastructure?

Understanding today's environmental problems, and biodiversity in particular, for example the spread of invasive species or the appearance of toxic algae in a lake, requires analyzing thousands of observations and measurements: from the simplest but most massive, such as temperature records, to the most sophisticated, such as multi-spectral satellite images or the results of genetic analyses.

But solving them generally requires very complex models that integrate these data into simulations in order to analyze different solutions. In the LifeWatch cloud, researchers, and environmental managers too, will have the computing services needed to better manage these massive, multi-source data and to analyze and model them.

In its initial phase, the LifeWatch Competence Center has used more than 10 million hours of computing, making it the most active competence center in EGI's European federated cloud, which is expected to be the seed of the European Open Science Cloud.

Finally, what would you like to achieve in the short and long term?

In recent years I have seen the incredible progress of deep learning techniques. Thanks to a "Garantía Juvenil" contract from the Ministry (my thanks to the Ministry and the ESF), we had a young researcher who implemented plant identification from images, and later a student who applied these techniques to the classification of collisions at CERN's Large Hadron Collider (LHC), and... it works very well! But we do not know why! I imagine a new generation of analysis in which we will not design the algorithm but rather the "brain" that will implement it afterwards, whose rules need not be the same as those of our minds.
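The classification idea can be sketched with a toy example: a small neural network that learns to separate two classes of events from their features alone, with no hand-designed selection rule. Everything here is an assumption for illustration only: the synthetic Gaussian "signal" and "background" features, the network size and the training settings are invented, and this is not the actual LHC analysis.

```python
import numpy as np

# Toy sketch (illustrative only): learn to separate two classes of
# synthetic "events" from their features, without hand-coding a rule.
rng = np.random.default_rng(0)

# Synthetic "signal" vs "background" events, 2 features each.
n = 500
signal = rng.normal(loc=1.5, scale=1.0, size=(n, 2))
background = rng.normal(loc=-1.5, scale=1.0, size=(n, 2))
X = np.vstack([signal, background])
y = np.concatenate([np.ones(n), np.zeros(n)])

# One hidden layer of 8 units; weights learned by plain gradient descent.
W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

lr = 0.1
for _ in range(300):
    h = np.tanh(X @ W1 + b1)              # hidden activations
    p = sigmoid(h @ W2 + b2).ravel()      # predicted P(signal)
    grad_out = (p - y)[:, None] / len(y)  # d(cross-entropy)/d(logit)
    gW2 = h.T @ grad_out; gb2 = grad_out.sum(0)
    grad_h = grad_out @ W2.T * (1 - h**2)  # backprop through tanh
    gW1 = X.T @ grad_h; gb1 = grad_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

accuracy = ((p > 0.5) == (y == 1)).mean()
print("classification accuracy:", round(accuracy, 2))
```

The point of the sketch mirrors the interviewee's remark: nothing in the code encodes *why* an event is signal or background; the network's internal rules emerge from training, which is exactly why such models can work well while remaining hard to interpret.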

We need very solid teams in Data Science, including machine learning, and that is what I will be working on over the next two years: learning, thinking and applying, both in the new master's programme and in the DEEP-Hybrid Data Cloud project.