Webinar: Regulatory-Grade Multimodal Medical Data De-Identification and Tokenization by Data Science Salon

About This Webinar

Healthcare and life science organizations are increasingly working with large-scale, multimodal datasets that include structured records, clinical notes, diagnostic images, and PDF documents.
Sharing this data for research and AI development requires rigorous de-identification that protects patient privacy without compromising the ability to extract insights across time and modalities.

In this webinar, experts from John Snow Labs and Databricks will demonstrate an end-to-end solution for automating the de-identification and tokenization of medical data with regulatory-grade accuracy. You’ll learn how to:

- Automatically de-identify structured data, unstructured text, DICOM & JPEG images, whole-slide pathology images (SVS), and PDFs using John Snow Labs’ industry-leading software and AI models
- Apply patient tokenization to enable linking of de-identified data across modalities and time points
- Use Databricks to process and scale these capabilities across large, real-world datasets
- Support HIPAA, GDPR, and other regulatory requirements for privacy-preserving research
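
To illustrate the patient tokenization idea from the list above, here is a minimal sketch of one common approach: a keyed hash (HMAC-SHA256) that maps the same patient identifier to the same opaque token regardless of which modality the record came from. This is not the John Snow Labs or Databricks implementation, which the listing does not describe. The function name `tokenize_patient`, the sample identifiers, the record layout, and the demo key are all hypothetical.

```python
import hashlib
import hmac


def tokenize_patient(patient_id: str, secret_key: bytes) -> str:
    """Return a stable, opaque token for a patient identifier.

    Hypothetical helper: normalizes the identifier, then applies
    HMAC-SHA256 so the token cannot be recomputed without the key.
    """
    normalized = patient_id.strip().upper()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    # Truncated for readability in this sketch; a real system may keep the full digest.
    return digest.hexdigest()[:16]


# Assumption: in practice the key would come from a secrets manager, not source code.
key = b"demo-key-not-for-production"

# The same patient appearing in two modalities receives the same token,
# so the de-identified records can still be linked.
note_record = {"patient": tokenize_patient("mrn-001234", key), "modality": "clinical_note"}
image_record = {"patient": tokenize_patient(" MRN-001234 ", key), "modality": "dicom"}

# A different patient receives a different token.
other_token = tokenize_patient("MRN-009999", key)

print(note_record["patient"] == image_record["patient"])
print(other_token != note_record["patient"])
```

Because the token depends on a secret key, records can be joined on the token across time points and modalities without exposing the original identifier, provided the key itself is kept out of the shared dataset.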

This session is ideal for data scientists, clinical researchers, compliance teams, and healthcare IT leaders working with multimodal patient data who want to enable longitudinal, privacy-compliant research at scale.

When: Wednesday, July 16, 2025 · 2:00 p.m. · Eastern Time (US & Canada)

Duration: 1 hour

Language: English

Who can attend? Everyone

Dial-in available? (listen only): No

Featured Presenters

Data Science Salon

Srikanth Kumar Rana

Solutions Architect, Databricks

Srikanth Kumar Rana is a seasoned Field Engineer at Databricks, bringing extensive experience in helping organizations unlock the full potential of data and AI. With a strong focus on empowering customers, Srikanth has consistently demonstrated expertise in complex deployments, driving adoption, and enabling businesses to achieve tangible outcomes on the Databricks Lakehouse platform.

Youssef Mellah, Ph.D.

Senior Data Scientist, Machine Learning Engineer, John Snow Labs

Youssef Mellah, Ph.D., is a Senior Data Scientist and Machine Learning Engineer at John Snow Labs with more than 8 years of experience in artificial intelligence, natural language processing, and deep learning. He specializes in building, training, and deploying regulatory-grade ML/DL models and large language models (LLMs) for healthcare and life sciences, including the de-identification and tokenization of multimodal medical data. Youssef has a strong track record of designing scalable, privacy-preserving AI solutions that enable compliant research and analytics across structured and unstructured data. He is passionate about advancing NLP technology, leading multidisciplinary teams, and transforming cutting-edge research into practical, real-world applications.

Hosted By

Data Science Salon

The DATA SCIENCE SALON is a unique vertical-focused data science conference that grew into a diverse community of senior data science, machine learning, and other technical specialists. We gather face-to-face and virtually to educate each other, illuminate best practices and innovate new solutions in a casual atmosphere.
