The Content-Type header declares the wrong charset.
Your scraper must detect the actual encoding and decode the content correctly.
| Property | Value |
|---|---|
| Declared charset (header) | utf-8 |
| Actual encoding (body) | iso-8859-1 |
| Scenario | French/Spanish accented text as raw Latin-1 bytes but header declares charset=utf-8 |
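
One way a scraper might handle the mismatch described in the table is to try the declared charset strictly and fall back to Latin-1 only when that decode fails. The following is a minimal sketch, not a definitive implementation: the function name `decode_body` and its parameters are hypothetical, and the fallback to ISO-8859-1 is an assumption taken from this scenario rather than a general rule.

```python
from __future__ import annotations


def decode_body(
    body: bytes,
    declared: str = "utf-8",
    fallback: str = "iso-8859-1",
) -> tuple[str, str]:
    """Decode raw response bytes and return (text, encoding_used).

    Hypothetical helper: trusts the declared charset first, then falls
    back to an assumed single-byte encoding if strict decoding fails.
    """
    try:
        # Strict mode raises instead of silently inserting U+FFFD.
        return body.decode(declared, errors="strict"), declared
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an unknown charset label in the header.
        # ISO-8859-1 maps every byte value, so this decode cannot fail.
        return body.decode(fallback), fallback


# Simulate the fixture: Latin-1 bytes that the header labels as utf-8.
raw = "Café, naïve, résumé".encode("iso-8859-1")
text, used = decode_body(raw, declared="utf-8")
print(used, text)
```

Accented Latin-1 bytes such as 0xE9 followed by an ASCII byte are not valid UTF-8, which is why the strict decode fails here and the fallback is used. This heuristic is not airtight: a short Latin-1 byte sequence can occasionally also be valid UTF-8, so a production scraper would likely combine it with other signals, such as a `<meta charset>` declaration or a statistical detector.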
Café, naïve, résumé – common French words.
Español: ¿Dónde está la biblioteca? ¡Hola mundo!
Português: A criança é muito inteligente.
À la carte, pièce de résistance, tête-à-tête.
This iframe is served as ISO-8859-1 bytes with the header charset=utf-8, a different mismatch from the main page.