PhD student, LISTIC, USMB
Contact
E-mail: asma.dhaouadi@univ-smb.fr
Telephone: +33(0) 765228805
Office: A221-A222
Address 1: LISTIC - Polytech Annecy-Chambery, BP 80439, 74944 Annecy le Vieux Cedex, France
Thesis
Group: LISTIC - ReGaRD team
Theme: Modelling Data Warehousing in the context of Big Data
Topic: CONTRIBUTION TO MASSIVE DATA STORAGE: GENERAL ARCHITECTURE, METHODOLOGY AND IMPLEMENTATION
Summary:
Data warehouses are indispensable for all information systems, as they play a key role in decision making. The typical architecture of a data warehouse is mainly composed of four parts: data sources, data preparation, target data storage, and data access and analysis. At the heart of this architecture is the ETL process, which Extracts, Transforms and Loads data into the target database for visualisation, reporting, analysis and decision making. In the era of Big Data, the major challenge for the community is to evolve traditional data warehouse architectures, and in particular the classical ETL process, to support the requirements of Big Data. The state of the art reveals two limitations. The first concerns Big Data approaches based on various dedicated technologies, such as the Hadoop ecosystem, Flink, Kafka, Kibana, etc. These technologies are evolving so rapidly that data warehouse architectures built on them become obsolete compared to the latest technologies. The second is that there is no standard model for the representation and design of ETL processes. Despite the contributions of existing work on ETL process modelling, the design of a generic ETL model capable of homogenising the different contemporary approaches is still a challenge. For these reasons, and building on Model Driven Engineering (MDE) as a generic framework and Model Driven Architecture (MDA) as a specific framework, we seek in this thesis to propose a new generic ETL model and a new generic architecture for massive data warehousing that supports this model. This architecture can be instantiated with specific technologies depending on the application domain. In addition, we propose a methodology to help experts derive, from the generic architecture, an implementation that meets the specific needs of their company.
Finally, we validate all the research work carried out on a practical case, such as the medical field (the COVID-19 pandemic), or on other applications.
A Multi-Layer Modeling for the Generation of New Architectures for Big Data Warehousing - https://hal.archives-ouvertes.fr/hal-03537854
A Two Level Architecture for Data Warehousing and OLAP Over Big Data - https://www.archives-ouvertes.fr/hal-02382486
Data Warehousing Process Modeling from Classical Approaches to New Trends: Main Features and Comparisons - https://hal.archives-ouvertes.fr/hal-03758493
Supervisors: Sébastien Monnet & Mohamed Mohsen Gammoudi
Co-supervisor: Khadija Arfaoui
Start of the thesis: January 2021