(BigData 2020) Ensemble learning for heterogeneous missing data imputation

About This Webinar

Abstract: Missing values can significantly affect the results of analyses and decision making in any field. Two major approaches deal with this issue: statistical and model-based methods. While the former introduces bias into the analyses, the latter is usually designed for limited and specific use cases. To overcome the limitations of both approaches, we present a stacked ensemble framework that integrates the adaptive random forest algorithm, the Jaccard index, and Bayesian probability. Given the challenge posed by heterogeneous, distributed data from multiple sources, we build a model for our use case that supports four data types: continuous, discrete, categorical, and binary. The proposed model tackles missing data in a broad and comprehensive context of massive data sources and data formats. We evaluated the framework extensively on five different datasets containing labelled and unlabelled data. The experiments showed that it produces encouraging and competitive results when compared to statistical and model-based methods. Because the framework works across various datasets, it overcomes the limitations of model-based methods identified in the literature review.
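
To make one of the ingredients named in the abstract concrete, the following is a minimal sketch of how the Jaccard index can drive a simple donor-based imputation for categorical data. It is not the authors' framework: the record layout, the helper names (`observed_items`, `impute_from_most_similar`), and the choice to copy the value from the single most similar donor are all assumptions made for illustration, and the adaptive random forest and Bayesian components are not shown.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical record layout: feature name -> categorical value, None if missing.
Record = Dict[str, Optional[str]]
Item = Tuple[str, str]


def jaccard_index(a: Set, b: Set) -> float:
    """Return |A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def observed_items(record: Record) -> Set[Item]:
    """Represent a record as the set of its observed (feature, value) pairs."""
    return {(k, v) for k, v in record.items() if v is not None}


def impute_from_most_similar(target: Record, donors: List[Record]) -> Record:
    """Fill each missing field of `target` from the most Jaccard-similar donor.

    Illustrative assumption: similarity is computed on observed fields only,
    excluding the field being imputed so the donor's answer cannot lower its
    own score.
    """
    target_items = observed_items(target)
    filled = dict(target)
    for field, value in target.items():
        if value is not None:
            continue
        # Only donors that actually observe this field can supply a value.
        candidates = [d for d in donors if d.get(field) is not None]
        if not candidates:
            continue  # nothing to copy from; leave the gap as-is
        best = max(
            candidates,
            key=lambda d: jaccard_index(
                target_items, observed_items(d) - {(field, d[field])}
            ),
        )
        filled[field] = best[field]
    return filled


donors = [
    {"color": "red", "shape": "round", "size": "small"},
    {"color": "blue", "shape": "square", "size": "large"},
]
target = {"color": "red", "shape": "round", "size": None}
result = impute_from_most_similar(target, donors)
```

In this toy run the first donor shares both observed fields with the target, so its `size` value is the one copied. A single-donor rule like this is only one of many ways to use a similarity measure for imputation.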

Authors: Andre L Costa Carvalho (Université du Québec, Canada); Darine Ameyed (ETS, Canada); Mohamed Cheriet (Ecole de technologie superieure (University of Quebec), Canada)

Email: andre-luis.costa-carvalho.1@ens.etsmtl.ca, darine.ameyed.1@ens.etsmtl.ca, mohamed.cheriet@etsmtl.ca

Who can view: Everyone

Webinar Price: Free

Featured Presenters

Andre L Costa Carvalho

Building the Modern Services Industry

Research interest:
His research interests include data science and machine learning. He has worked for the last 20 years at various multinational companies in the ICT sector and is currently researching big data, data mining, and data fusion.

Short Biography:

In April 2018, he started his master's degree in machine learning at the Synchromedia laboratory at ETS (École de Technologie Supérieure, University of Quebec),
under the supervision of Prof. Mohamed Cheriet.
He obtained his MBA in Innovation, Sustainability, and Governance at USP (University of São Paulo, São Paulo, Brazil) in 2017.
He received a B.Sc. in Game Design from the Anhembi Morumbi University (São Paulo, Brazil) in 2014.


Hosted By

Services Society
