
(BigData 2020) Ensemble learning for heterogeneous missing data imputation

About This Webinar

Abstract: Missing values can significantly affect the results of analyses and decision making in any field. Two major approaches deal with this issue: statistical and model-based methods. While the former introduces bias into the analyses, the latter is usually designed for limited and specific use cases. To overcome the limitations of both, we present a stacked ensemble framework that integrates the adaptive random forest algorithm, the Jaccard index, and Bayesian probability. Because heterogeneous, distributed data from multiple sources poses a challenge, the model we build for our use case supports different data types: continuous, discrete, categorical, and binary. The proposed model tackles missing data in a broad and comprehensive context of massive data sources and data formats. We evaluated the framework extensively on five different datasets containing labelled and unlabelled data. The experiments showed that it produces encouraging and competitive results when compared to statistical and model-based methods. Since the framework works across various datasets, it overcomes the limitations of model-based methods found in the literature review.
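
To make one of the ingredients named in the abstract concrete, here is a minimal Python sketch of how the Jaccard index could be used to fill a missing categorical value. It is an illustration only and not the authors' framework: the function names (`jaccard`, `impute_categorical`), the similarity-weighted vote, and the toy rows are all assumptions introduced here, and the adaptive random forest and Bayesian components are not shown.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def impute_categorical(rows, target):
    """Fill missing (None) values of `target` by a Jaccard-weighted vote.

    Hypothetical helper: each complete row votes for its own target value,
    weighted by how similar its observed features are to the incomplete row.
    """
    def features(row):
        # A row becomes the set of its observed (column, value) pairs,
        # excluding the column being imputed.
        return {(k, v) for k, v in row.items() if k != target and v is not None}

    donors = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        if row[target] is not None:
            filled.append(dict(row))
            continue
        votes = defaultdict(float)
        for donor in donors:
            votes[donor[target]] += jaccard(features(row), features(donor))
        best = max(votes, key=votes.get) if votes else None
        filled.append({**row, target: best})
    return filled


# Toy data (invented for illustration): the last row lacks its label.
rows = [
    {"color": "red", "shape": "round", "label": "apple"},
    {"color": "red", "shape": "round", "label": "apple"},
    {"color": "yellow", "shape": "long", "label": "banana"},
    {"color": "red", "shape": "round", "label": None},
]
result = impute_categorical(rows, "label")
print(result[3]["label"])
```

The set-based representation works for categorical and binary fields; continuous or discrete numeric fields would need a different similarity measure or prior binning, which is one reason a framework for heterogeneous data combines several components.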

Authors: Andre L Costa Carvalho (Université du Québec, Canada); Darine Ameyed (ETS, Canada); Mohamed Cheriet (Ecole de technologie superieure (University of Quebec), Canada)

Email: andre-luis.costa-carvalho.1@ens.etsmtl.ca, darine.ameyed.1@ens.etsmtl.ca, mohamed.cheriet@etsmtl.ca

Who can view: Everyone
Webinar Price: Free
Featured Presenters
Webinar hosting presenter Services Society
Research interest:
His research interests include data science and machine learning. He is an enthusiast who has worked for the last 20 years at different multinational companies in the ICT area. He is currently researching big data, data mining, and data fusion.

Short Biography:

In April 2018, he started his master's degree in machine learning at the Synchromedia laboratory at ETS (École de Technologie Supérieure, University of Quebec), under the supervision of Prof. Mohamed Cheriet.
He obtained his MBA in Innovation, Sustainability, and Governance at USP (University of São Paulo, São Paulo, Brazil) in 2017.
He received a B.Sc. in Game Design from Anhembi Morumbi University (São Paulo, Brazil) in 2014.
Hosted By
Services Society webinar platform hosts (BigData 2020) Ensemble learning for heterogeneous missing data imputation