A Python Workshop To Efficiently Access, Deliver, Visualize, and Analyze Big Environmental Data in the Cloud

Ryan Paul Lafler

Saturday, May 9, 2026

12:00 pm - 3:00 pm PT

Virtual 3-Hour Workshop

$29.95

Workshop Tags

Python, Big Data, Cloud Computing, Environmental Data, Amazon S3, Google Cloud Storage, Dask, Xarray, Climate Data, Meteorology, NetCDF, GRIB2, CMIP6, NOAA, ECMWF

Workshop Description

This virtual, hands-on Python workshop, presented by Ryan Paul Lafler, provides an applied introduction to accessing, processing, visualizing, and analyzing large environmental and weather datasets directly in the cloud. Designed for Python programmers, data scientists, environmental and climate scientists, researchers, students, and applied practitioners, the workshop focuses on efficient, scalable strategies for working with petabyte-scale data stored in Amazon S3, Google Cloud Storage (GCS), and Microsoft Azure, without downloading entire repositories or duplicating data locally.

Previously presented at the American Meteorological Society (AMS), this workshop gives attendees the practical skills and working knowledge to handle large environmental datasets in cloud environments using modern big-data processing and computation strategies. Participants learn how to move beyond local file-based workflows and adopt cloud-native analysis patterns that scale from personal systems to larger computing environments.

Attendees will build Python data pipelines capable of selectively accessing, processing, and retrieving only the environmental information they need, even when datasets span petabytes of open climate and meteorological data. Using modern open-source Python libraries such as Dask for scalable computation, Xarray for multidimensional spatiotemporal data, and fsspec for cloud object storage access, participants will learn how to efficiently process, visualize, and analyze large climatological and meteorological datasets directly from the cloud.
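
As a rough illustration of the selective-access idea described above, the sketch below uses fsspec to list a storage location and read only a small byte range from a larger object. It is a minimal sketch under stated assumptions, not the workshop's own material: fsspec's built-in in-memory filesystem stands in for a real bucket so the example runs offline, and the path and payload are hypothetical. Against a real public bucket, one would typically swap in a filesystem such as `fsspec.filesystem("s3", anon=True)`, which assumes the optional s3fs package is installed.

```python
import fsspec

# Assumption: the in-memory filesystem is a stand-in for cloud object storage
# (Amazon S3, GCS, Azure) so this sketch needs no network or credentials.
fs = fsspec.filesystem("memory")

# Hypothetical object path and synthetic 1 MiB payload, for illustration only.
path = "/demo-bucket/rtma/example.bin"
payload = bytes(range(256)) * 4096
with fs.open(path, "wb") as f:
    f.write(payload)

# List the "bucket" prefix and check the object size without reading the data.
listing = fs.ls("/demo-bucket/rtma", detail=False)
size = fs.size(path)

# Read only 16 bytes starting at offset 1024 instead of the whole object.
chunk = fs.cat_file(path, start=1024, end=1024 + 16)

print(listing)
print(size, len(chunk))
```

The same `ls`, `size`, and `cat_file` calls are part of fsspec's common filesystem interface, which is why code written this way tends to carry over between storage backends with little change.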

Key topics covered include:

Connecting to cloud object storage (Amazon S3, GCS, Azure) using Python and fsspec
Working with real-world datasets, including the Coupled Model Intercomparison Project (CMIP6) and NOAA’s Real-Time Mesoscale Analysis (RTMA)
Understanding environmental data formats: NetCDF, GRIB2, HDF5, and Zarr
Efficient data access patterns, including lazy loading, chunking, and graph-based execution
Building scalable data pipelines with Dask and Xarray
Extracting time series and performing spatiotemporal analysis
Visualizing environmental data using Matplotlib, Cartopy, and Rasterio
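
Several of the topics above (lazy loading, chunking, and time series extraction) can be sketched together with Xarray and Dask. The example below is a minimal sketch, assuming a small synthetic temperature grid in place of a real CMIP6 or RTMA dataset; the variable name `t2m`, the grid, and the selected coordinates are all hypothetical. With real cloud data, the dataset would usually come from a call such as `xr.open_dataset` or `xr.open_zarr` rather than being built in memory.

```python
import numpy as np
import xarray as xr

# Synthetic hourly grid (assumption: stands in for a real cloud-hosted dataset).
time = np.arange("2026-01-01T00", "2026-01-03T00", dtype="datetime64[h]")
time = time.astype("datetime64[ns]")
lat = np.linspace(30.0, 35.0, 50)
lon = np.linspace(-120.0, -115.0, 50)

# Temperature rises 0.1 K per hour everywhere, so results are easy to verify.
hours = np.arange(time.size, dtype="float64")
t2m = 280.0 + 0.1 * hours[:, None, None] + np.zeros((time.size, lat.size, lon.size))

# Chunking turns the variable into a Dask-backed array (requires dask).
ds = xr.Dataset(
    {"t2m": (("time", "lat", "lon"), t2m)},
    coords={"time": time, "lat": lat, "lon": lon},
).chunk({"time": 24, "lat": 25, "lon": 25})

# Extract a time series at the grid point nearest a location of interest.
series = ds["t2m"].sel(lat=32.7, lon=-117.2, method="nearest")

# Still lazy: this only extends the task graph, nothing is computed yet.
anomaly = series - series.mean("time")

# compute() executes the graph and returns an in-memory result.
result = anomaly.compute()
print(ds["t2m"].chunks)
print(float(result[0]), float(result[-1]))
```

Because the selection happens before `compute()`, only the chunks that contain the chosen grid point need to be processed, which is the main reason this pattern scales to much larger arrays.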

All participants will receive a Certificate of Completion, non-redistributable PDF slides, a documented Jupyter Notebook, and cloud data access examples to support continued practice, skill-building, and independent learning after the workshop.

Workshop Summary

Unlock the power of Python to access, process, and visualize petabytes of environmental data in the cloud. In this hands-on workshop, you’ll connect to real-world climatological and meteorological datasets, including the petabyte-scale Coupled Model Intercomparison Project (CMIP6) and NOAA's hourly Real-Time Mesoscale Analysis (RTMA) data, stored in Amazon S3 and Google Cloud Storage. Learn to efficiently retrieve and analyze data using Dask, Xarray, and fsspec, all without downloading or storing massive files locally. From lazy loading and data chunking to spatiotemporal analysis and visualization, you’ll master the tools and workflows needed to work with big environmental data at scale.

Benefits of Enrolling in this Workshop

By enrolling in this workshop, attendees will:

Earn a Certificate of Completion from Premier Training by Premier Analytics Consulting
Learn to connect Python applications directly to cloud-based data sources, including Amazon S3, Google Cloud Storage (GCS), and Microsoft Azure
Build reusable, efficient Python pipelines for retrieving and processing large climatological and meteorological datasets
Understand and apply big-data techniques such as lazy loading, chunking, and parallel computing for scalable analysis
Use modern open-source Python tools, including Dask, Xarray, fsspec, and visualization libraries such as Cartopy and Matplotlib
Gain hands-on experience with real, research-grade environmental data and simulations, mirroring workflows used in operational and academic settings
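
The chunking and parallel computing techniques named in the list above can be illustrated with a plain Dask array. This is a minimal sketch with a hypothetical array size and chunk shape, not taken from the workshop materials: the array is split into blocks, the reduction is recorded as a task graph, and the work runs across threads only when `compute()` is called.

```python
import dask.array as da

# A 4000 x 4000 array split into 1000 x 1000 blocks: 16 chunks in total.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Building the expression is lazy; this only records tasks in a graph.
total = (x * 2).sum()

n_chunks = x.npartitions
n_tasks = len(total.__dask_graph__())

# Execute the graph; the threaded scheduler processes chunks in parallel.
value = total.compute(scheduler="threads")

print(n_chunks, n_tasks, float(value))
```

Smaller chunks expose more parallelism but add scheduling overhead, so chunk sizes are usually tuned to the dataset and the available memory.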

Intended Audience

This workshop is suitable for a broad audience, including data scientists, researchers, programmers, students, educators, and working professionals who are interested in cloud computing and large-scale environmental data analysis using Python. The material is designed to be accessible to those with basic Python experience, while still providing practical value and scalable techniques that intermediate users can immediately apply on their local systems.

Prerequisites

Basic familiarity with Python programming, including running scripts and working with common data structures
Comfort working in Jupyter Notebook or similar interactive Python environments
Prior exposure to data analysis or environmental datasets is helpful but not required

About the Trainer(s)

Ryan Paul Lafler is the President, CEO, and Lead Consultant of Premier Analytics Consulting, LLC, a San Diego–based consulting firm specializing in applied AI and machine learning systems, big data engineering, statistical analysis, and developing custom full-stack analytics platforms. He partners with enterprise organizations, public-sector agencies, and research institutions to process, analyze, and model complex data in support of informed decision-making.

Ryan brings experience as a consultant, programmer, full-stack developer, AI/ML engineer, and data scientist. His work focuses on building open-source analytics platforms, scalable big data engineering solutions, rigorous statistical analysis, and applied AI workflows, with a strong emphasis on reliability, validation, and production readiness. He has expertise in cross-industry programming for analytics, AI, and data science using Python, R, SQL/NoSQL, SAS®, and modern JavaScript frameworks, and regularly implements quality controls and validation practices for AI-generated code and automated analytics workflows.

Ryan is also an Adjunct Faculty member in the Big Data Analytics Graduate Program, the Department of Mathematics and Statistics, and the Global Campus at San Diego State University.

He earned a Master of Science in Big Data Analytics (2023), following the successful defense and publication of his thesis, and a Bachelor of Science in Statistics with a Minor in Quantitative Economics, with Distinction (2020), both from San Diego State University.
