Asma DHAOUADI

Doctoral student, LISTIC, USMB

Contact

E-mail: asma.dhaouadi@univ-smb.fr

Telephone: +33(0) 765228805

Office: A221-A222

Address 1: LISTIC - Polytech Annecy-Chambery, BP 80439, 74944 Annecy le Vieux Cedex, France

Thesis

Group: LISTIC - ReGaRD team

Theme: Modeling Data Warehousing in the context of Big Data

Subject: CONTRIBUTING TO MASSIVE DATA STORAGE: GENERIC ARCHITECTURE, METHODOLOGY AND IMPLEMENTATION

Summary:

Data warehouses are indispensable for all information systems, as they play a key role in decision-making. The typical architecture of a data warehouse consists mainly of four parts: data sources, data preparation, target data storage, and data access and analysis. At the heart of this architecture lies the ETL process, which Extracts, Transforms and Loads data into the target database for visualization, reporting, analysis and decision-making purposes. In the era of Big Data, the major challenge facing the community is to evolve traditional data warehouse architectures, and in particular the classic ETL process, to support the requirements of Big Data. The state of the art reveals two limitations. The first concerns Big Data approaches based on various dedicated technologies, such as the Hadoop ecosystem, Flink, Kafka, Kibana and others. These technologies are evolving rapidly, to the point where existing data warehouse architectures are becoming obsolete compared with the latest technologies. The second is that there is no standard model for the representation and design of ETL processes. Despite the contributions of ETL process modeling work in the literature, the design of a generic ETL model capable of homogenizing the various contemporary approaches remains an open issue. For these reasons, using model-driven engineering (MDE) as a generic framework and model-driven architecture (MDA) as a specific framework, we aim in this thesis to propose a new generic ETL model and a new generic architecture for massive data warehousing that supports this model. This architecture can be instantiated with specific technologies depending on the application domain. In addition, we propose a methodology, based on the generic architecture, to help experts implement an architecture that meets the specific needs of their company. Finally, we validate the research work on a practical case, such as the medical field (the COVID-19 pandemic) or other applications.

Keywords: Data Warehouse, ETL Process Modeling, Data Warehousing Architectures, Knowledge Discovery, Meta-Model, Generic Methodology

Publications:
A Multi-Layer Modeling for the Generation of New Architectures for Big Data Warehousing - https://hal.archives-ouvertes.fr/hal-03537854
A Two Level Architecture for Data Warehousing and OLAP Over Big Data - https://www.archives-ouvertes.fr/hal-02382486
Data Warehousing Process Modeling from Classical Approaches to New Trends: Main Features and Comparisons - https://hal.archives-ouvertes.fr/hal-03758493

 

Supervisors: Sébastien Monnet & Mohamed Mohsen Gammoudi

Co-supervisor: Khadija Arfaoui

Start of thesis: January 2021