Tags: Sentinel-2, cloud masking, processing, compositing

Dealing with Clouds in Sentinel-2 Data: Masking Techniques That Work

Name: Off-Nadir Delta
Author: Kazushi Motomura

Date: January 2, 2026 (7 min read)

Quick Answer: Clouds are the biggest practical limitation of optical satellite data. Sentinel-2 Level-2A includes a Scene Classification Map (SCL) that classifies each pixel as cloud, cloud shadow, vegetation, water, etc. Mask pixels with SCL values 3, 8, 9, and 10 to remove clouds and shadows, plus 0 and 1 to drop no-data and defective pixels. For persistent cloud cover, build temporal composites using the median or best-pixel approach across multiple dates. In tropical regions, expect 60-80% cloud cover and plan for composite windows of 3-6 months.

I once spent a week processing Sentinel-2 data over Borneo, only to realize that every single acquisition in my three-month window had more than 70% cloud cover. Welcome to tropical remote sensing.

Clouds are not a minor inconvenience in optical satellite work — they're the primary limiting factor. Globally, about 67% of Earth's surface is covered by clouds at any given time. In equatorial regions, the figure exceeds 80% during monsoon seasons.

The SCL Approach

Sentinel-2 Level-2A products include the Scene Classification Map (SCL), generated by the Sen2Cor atmospheric correction processor. It classifies every pixel into one of 12 categories:

SCL Value	Class	Action
0	No data	Exclude
1	Saturated/defective	Exclude
2	Dark area pixels	Keep (with caution)
3	Cloud shadows	Mask
4	Vegetation	Keep
5	Bare soils	Keep
6	Water	Keep
7	Unclassified (cloud low probability in older Sen2Cor versions)	Keep or mask
8	Cloud medium probability	Mask
9	Cloud high probability	Mask
10	Thin cirrus	Mask
11	Snow/ice	Context-dependent

The conservative approach: mask everything with SCL values 0, 1, 3, 8, 9, and 10. This removes definite clouds, cloud shadows, thin cirrus, and defective pixels.

The aggressive approach: also mask SCL 7 (unclassified in current Sen2Cor documentation, cloud low probability in older versions) and 2 (dark area pixels, which sometimes indicate undetected cloud shadow). This removes more potential contamination but also throws away more valid data.
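The two masking strategies can be sketched in a few lines. This is a minimal illustration, assuming the SCL band has already been read into a nested list of integer class codes; the function name `valid_mask` and the tiny example tile are hypothetical, and a real workflow would more likely operate on raster arrays.

```python
# SCL class codes from the table above.
CONSERVATIVE_MASK = {0, 1, 3, 8, 9, 10}
# The aggressive variant also rejects dark area pixels (2) and class 7.
AGGRESSIVE_MASK = CONSERVATIVE_MASK | {2, 7}


def valid_mask(scl, aggressive=False):
    """Return a grid of booleans: True where the pixel should be kept."""
    rejected = AGGRESSIVE_MASK if aggressive else CONSERVATIVE_MASK
    return [[value not in rejected for value in row] for row in scl]


# Hypothetical 2 x 4 SCL tile, purely for illustration.
scl_tile = [
    [4, 4, 9, 3],
    [7, 2, 5, 10],
]

conservative = valid_mask(scl_tile)
aggressive = valid_mask(scl_tile, aggressive=True)

for label, mask in (("conservative", conservative), ("aggressive", aggressive)):
    kept = sum(flag for row in mask for flag in row)
    print(f"{label}: {kept} of 8 pixels kept")
```

On this tile the aggressive mask discards two additional pixels, which is the trade-off described above in miniature.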

SCL Limitations

The SCL isn't perfect. I've encountered several recurring issues:

Commission errors over bright surfaces: White sand beaches, salt flats, and limestone exposures are sometimes classified as clouds. If your study area includes bright surfaces, verify the SCL against the actual imagery.

Missed thin cirrus: Thin, semi-transparent cirrus clouds can pass through the SCL filter, especially at the "low probability" threshold. These contaminate reflectance values without being obvious in the classification.

Cloud shadow misplacement: Shadow detection depends on estimating cloud height and solar geometry. The shadow mask can be displaced by several hundred meters, missing the actual shadow while masking valid pixels nearby.

Snow/cloud confusion: In mountainous areas during winter, distinguishing fresh snow from clouds is genuinely difficult. The SWIR-based discrimination helps (snow absorbs SWIR; clouds don't), but it's not foolproof.
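One common way to express the SWIR-based discrimination is the Normalized Difference Snow Index (NDSI), computed from green and SWIR reflectance (bands B3 and B11 on Sentinel-2). The sketch below is only an illustration: the 0.4 threshold is a frequently quoted convention rather than a value from this article, the reflectance values are made up, and this is not a description of how Sen2Cor itself decides.

```python
# Assumed threshold: 0.4 is a commonly quoted NDSI cutoff, not a universal one.
NDSI_SNOW_THRESHOLD = 0.4


def ndsi(green, swir):
    """Normalized Difference Snow Index from green and SWIR reflectance."""
    total = green + swir
    if total == 0:
        return 0.0
    return (green - swir) / total


def looks_like_snow(green, swir, threshold=NDSI_SNOW_THRESHOLD):
    """True when the pixel is bright in green but dark in SWIR."""
    return ndsi(green, swir) > threshold


# Hypothetical reflectances: snow is dark in SWIR, cloud stays bright.
snow_pixel = {"green": 0.85, "swir": 0.10}
cloud_pixel = {"green": 0.70, "swir": 0.50}

print(looks_like_snow(**snow_pixel))   # snow-like spectrum
print(looks_like_snow(**cloud_pixel))  # cloud-like spectrum
```

A single threshold like this fails in exactly the difficult cases mentioned above, so treat it as a cross-check on the SCL rather than a replacement.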

Building Cloud-Free Composites

When single acquisitions are too cloudy, compositing aggregates multiple dates to fill gaps.

Median Composite

Take all valid (non-clouded) pixels from a time window and compute the median value for each pixel. The median is preferred over the mean because it's resistant to outliers — a partially unmasked cloud or shadow won't corrupt the result as badly.
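A per-pixel median over masked observations might look like the following. This is a dependency-free sketch using nested lists indexed as `stack[date][row][col]`; the function name and the example values are hypothetical, and production code would normally use an array library instead.

```python
from statistics import median


def median_composite(stack, valid):
    """Per-pixel median over time, ignoring masked observations.

    stack[t][r][c] is a reflectance value; valid[t][r][c] is True when the
    cloud mask kept that observation. Pixels never observed clearly get None.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            values = [
                stack[t][r][c] for t in range(len(stack)) if valid[t][r][c]
            ]
            row.append(median(values) if values else None)
        composite.append(row)
    return composite


# Hypothetical 1 x 3 tile over five dates. The third date carries a cloud:
# it slipped through the mask for pixel 0 and was caught for pixel 1.
stack = [
    [[0.05, 0.30, 0.80]],
    [[0.06, 0.31, 0.85]],
    [[0.45, 0.90, 0.90]],
    [[0.05, 0.29, 0.88]],
    [[0.07, 0.32, 0.82]],
]
valid = [
    [[True, True, False]],
    [[True, True, False]],
    [[True, False, False]],
    [[True, True, False]],
    [[True, False, False]],
]

composite = median_composite(stack, valid)
print(composite)
```

For pixel 0 the unmasked cloud value of 0.45 does not move the median, whereas a mean over the same five values would be pulled well above the clear-sky reflectance.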

Window selection matters: Too short and you don't have enough clear observations. Too long and genuine surface changes (crop growth, seasonal vegetation shifts) contaminate the composite. Rules of thumb:

Temperate regions, summer: 1-month window usually sufficient
Temperate regions, winter: 2-3 months (more clouds, less frequent clear sky)
Tropical regions: 3-6 months for dry season composite; may not be possible during wet season
Arid regions: 2-week window often sufficient

Best-Pixel Composite

Instead of the median, select the single "best" observation for each pixel — usually the one with the highest NDVI (for vegetation) or the lowest blue-band reflectance (as a proxy for least atmospheric contamination).

Best-pixel composites preserve more spectral fidelity than median composites because they use actual observations rather than statistical summaries. The downside is that adjacent pixels may come from different dates, creating spatial inconsistencies in rapidly changing landscapes.
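The maximum-NDVI variant can be sketched as a search for the best date per pixel. The function below returns the index of the winning date rather than a value, so every band can then be pulled from the same observation. Names and data are hypothetical, and nested lists stand in for real raster arrays.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def best_date_index(red_stack, nir_stack, valid):
    """For each pixel, index of the valid date with the highest NDVI.

    Stacks are indexed [date][row][col]. Pixels with no valid
    observation get None.
    """
    n_dates = len(red_stack)
    n_rows, n_cols = len(red_stack[0]), len(red_stack[0][0])
    best = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            best_score = None
            for t in range(n_dates):
                if not valid[t][r][c]:
                    continue
                score = ndvi(red_stack[t][r][c], nir_stack[t][r][c])
                if best_score is None or score > best_score:
                    best_score = score
                    best[r][c] = t
    return best


# Hypothetical 1 x 2 tile over three dates.
red_stack = [[[0.10, 0.08]], [[0.05, 0.20]], [[0.04, 0.06]]]
nir_stack = [[[0.30, 0.40]], [[0.45, 0.25]], [[0.50, 0.48]]]
valid = [[[True, True]], [[True, True]], [[False, True]]]

chosen = best_date_index(red_stack, nir_stack, valid)
print(chosen)
```

In this example the two neighbouring pixels end up drawn from different dates, which is the spatial inconsistency described above.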

Harmonic Fitting

For annual monitoring, fitting a harmonic function (sine/cosine curves) to the time series of valid observations provides a modeled estimate for any date. This handles irregular observation gaps elegantly and produces smooth, continuous time series. It's more complex to implement but produces cleaner results for phenology tracking.
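As a rough illustration, a first-order harmonic model for one pixel is y(t) = a0 + a1 cos(2πt/T) + b1 sin(2πt/T), fitted by least squares to the valid observations. The sketch below assumes an annual period of 365.25 days and a single harmonic, solves the normal equations directly, and uses synthetic data. All names are hypothetical, and real implementations often add higher harmonics or a trend term.

```python
import math

PERIOD_DAYS = 365.25  # assumed annual cycle


def design_row(day):
    """Regressors for one observation: constant, cosine, sine."""
    angle = 2.0 * math.pi * day / PERIOD_DAYS
    return [1.0, math.cos(angle), math.sin(angle)]


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_harmonic(days, values):
    """Least-squares coefficients [a0, a1, b1] via the normal equations."""
    rows = [design_row(d) for d in days]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    return solve_linear(normal, rhs)


def predict(coeffs, day):
    """Modeled value for any day of year, observed or not."""
    return sum(c * x for c, x in zip(coeffs, design_row(day)))


# Synthetic, irregularly spaced clear observations from known coefficients.
true_coeffs = [0.5, -0.2, 0.1]
days = [10, 45, 60, 130, 200, 215, 290, 340]
values = [predict(true_coeffs, d) for d in days]

fitted = fit_harmonic(days, values)
print([round(c, 6) for c in fitted])
print(round(predict(fitted, 170), 4))  # estimate for an unobserved date
```

The fit needs at least three well-separated dates to be solvable, and with noisy real data the recovered coefficients will only approximate the underlying seasonal cycle.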

Practical Workflow

Here's the workflow I use most frequently:

Download Level-2A data for your area and time window
Apply SCL mask: Remove pixels with SCL values 0, 1, 3, 8, 9, 10
Check coverage: Calculate the percentage of valid pixels per scene. Discard scenes with less than 20% valid coverage — they contribute noise without useful data
Composite: If single-date coverage is insufficient, build a median composite from remaining valid observations
Visual QC: Always inspect the result. Zoom to areas where you suspect residual cloud contamination. Compare against the input scenes.
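The masking and coverage-check steps above can be sketched as follows, assuming each scene's SCL band is available as a nested list of class codes. The scene names and grids are hypothetical, and the download, compositing, and visual QC steps are out of scope here.

```python
MASKED = {0, 1, 3, 8, 9, 10}  # SCL values removed in the masking step
MIN_VALID_FRACTION = 0.20     # coverage threshold from the workflow


def valid_fraction(scl):
    """Share of pixels that survive the SCL mask, between 0 and 1."""
    total = sum(len(row) for row in scl)
    kept = sum(1 for row in scl for value in row if value not in MASKED)
    return kept / total if total else 0.0


def select_scenes(scenes):
    """Names of scenes with at least the minimum valid coverage."""
    return [
        name
        for name, scl in scenes.items()
        if valid_fraction(scl) >= MIN_VALID_FRACTION
    ]


# Hypothetical SCL grids for two acquisition dates.
scenes = {
    "date_a": [[4, 4, 9, 9], [5, 6, 8, 3]],   # half the pixels are usable
    "date_b": [[9, 9, 9, 9], [9, 8, 3, 4]],   # one usable pixel in eight
}

usable = select_scenes(scenes)
print(usable)
```

In practice the fraction should be computed over the area of interest rather than the full tile, for the reason discussed in the next section.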

The Cloud Cover Metadata Trap

Sentinel-2 metadata includes a scene-level cloud cover percentage. It's tempting to use this to filter — "give me all scenes with less than 20% clouds." But this number represents the entire scene, not your specific area of interest.

A scene might have 15% overall cloud cover, but if those clouds sit directly over your study area, the image is useless for your purpose. Conversely, a 60% cloud cover scene might be perfectly clear over your region.

Always check cloud cover spatially, not just numerically.
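A quick way to see the difference is to compute the cloud fraction twice, once for the whole scene and once for a window covering the study area. The sketch below assumes cloud means SCL classes 8, 9, and 10, which may not match how the scene-level metadata figure is derived. The window convention and the example grid are hypothetical.

```python
CLOUD_CLASSES = {8, 9, 10}  # assumed definition of "cloud" for this sketch


def cloud_fraction(scl, window=None):
    """Cloud fraction over the whole grid or a sub-window.

    window is (row_start, row_stop, col_start, col_stop), half-open,
    in pixel coordinates.
    """
    if window is None:
        window = (0, len(scl), 0, len(scl[0]))
    r0, r1, c0, c1 = window
    pixels = [value for row in scl[r0:r1] for value in row[c0:c1]]
    if not pixels:
        return 0.0
    return sum(1 for value in pixels if value in CLOUD_CLASSES) / len(pixels)


# Hypothetical 4 x 5 scene with a small cloud bank in the top-left corner.
scl = [
    [9, 9, 4, 4, 4],
    [8, 4, 4, 5, 5],
    [4, 4, 6, 6, 5],
    [4, 4, 6, 6, 5],
]
study_area = (0, 2, 0, 2)  # top-left 2 x 2 block

print(cloud_fraction(scl))              # whole scene
print(cloud_fraction(scl, study_area))  # study area only
```

Here the scene as a whole is 15% cloudy, yet three of the four study-area pixels are clouded, so a metadata filter alone would have kept an unusable image.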

How Many Clear Scenes Can You Expect? Regional Reference

Planning a cloud masking strategy starts with knowing how many valid scenes you're likely to have. These are approximate annual counts of Sentinel-2 scenes with less than 20% cloud cover over the study area, based on climatological cloud frequency:

Region / Climate Type	Clear Scenes / Year	Composite Window Needed
Western Sahara / Arabian Peninsula	60–70	2–4 weeks
Mediterranean coast (summer)	40–60	1–2 months
Temperate continental (summer)	25–40	1–2 months
Temperate maritime (UK, NW France)	15–25	2–3 months
Temperate continental (winter)	5–15	3–6 months
Subtropical monsoon (dry season)	20–40	2–3 months
Subtropical monsoon (wet season)	2–8	May not be possible
Tropical equatorial (year-round)	8–20	3–6 months
Coastal rainforest (equatorial)	3–10	4–6 months

Note that Sentinel-2 has a 5-day revisit at the equator (with both satellites), which gives 365 / 5 = 73, or roughly 70, potential acquisitions per year. In the Sahara you can use most of them; in equatorial rainforest you may have fewer than 15 genuinely usable scenes per year.

Planning rule of thumb: For a reliable median composite, you need at least 5–6 valid observations per pixel. If your regional cloud climatology suggests fewer, extend your composite window or reduce your expectations about the data's temporal specificity.
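The rule of thumb translates into simple arithmetic. The sketch below makes a strong simplifying assumption that clear scenes are spread evenly through the year, which seasonal cloud patterns will violate, so read the result as a first estimate only. The function names are hypothetical.

```python
import math

DAYS_PER_YEAR = 365
REVISIT_DAYS = 5  # two-satellite revisit quoted in the text


def potential_acquisitions(revisit_days=REVISIT_DAYS):
    """Upper bound on acquisitions per year for a given revisit interval."""
    return DAYS_PER_YEAR // revisit_days


def window_days(min_observations, clear_scenes_per_year):
    """Days needed to gather min_observations clear scenes.

    Assumes clear scenes are evenly distributed over the year.
    """
    if clear_scenes_per_year <= 0:
        raise ValueError("no clear scenes expected; compositing is not feasible")
    return math.ceil(min_observations * DAYS_PER_YEAR / clear_scenes_per_year)


print(potential_acquisitions())  # 365 // 5
print(window_days(6, 40))        # e.g. a temperate continental summer
print(window_days(6, 20))        # e.g. upper end of the tropical range
print(window_days(6, 8))         # e.g. lower end of the tropical range
```

With 6 required observations, 20 clear scenes per year implies a window of about 110 days, while 8 clear scenes per year pushes it to roughly 9 months, at which point the composite no longer represents a single season.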

When Clouds Win

Sometimes there's no optical solution. Monsoon-season data in Southeast Asia, wet-season data in Central Africa, persistent stratus over coastal deserts — these situations defeat any amount of temporal compositing.

That's when SAR data becomes essential. Sentinel-1's C-band radar penetrates cloud cover, with only very heavy rain noticeably affecting the signal. You lose the spectral information of Sentinel-2, but you gain reliable, all-weather observations. Many operational monitoring systems, such as flood mapping and deforestation detection in the tropics, rely on SAR precisely because clouds make optical monitoring unreliable.

The most robust monitoring systems fuse both: optical data when available (for its spectral richness) and SAR data when clouds prevent optical observations (for its all-weather reliability). This complementary approach acknowledges what no amount of cloud masking can fix — sometimes the sky simply isn't cooperating.

Kazushi Motomura

Remote sensing specialist with 10+ years in satellite data processing. Founder of Off-Nadir Lab. Master's in Satellite Oceanography (Kyushu University). Co-author, Remote Sensing Encyclopedia.
