Bad Encoding Scenario

Challenge: The HTTP Content-Type header declares the wrong charset. Your scraper must detect the actual encoding of the body and decode the content correctly.
Declared charset (header): utf-8
Actual encoding (body): cp1252
Scenario: Raw CP1252 bytes with smart quotes and special symbols, but the header declares charset=utf-8
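
One possible way to handle this challenge is to treat the declared charset as a hint only: try it strictly first, then fall back to other candidates. The sketch below is an assumption about how a scraper might do this, not a method prescribed by this page. The function name decode_body and the fallback order (utf-8, then cp1252, then latin-1) are hypothetical choices. Strict decoding can tell invalid UTF-8 apart from CP1252, but it cannot distinguish between single-byte encodings, so other scenarios would likely need a statistical detector.

```python
from typing import Optional, Sequence, Tuple


def decode_body(
    body: bytes,
    declared: Optional[str] = None,
    fallbacks: Sequence[str] = ("utf-8", "cp1252"),
) -> Tuple[str, str]:
    """Return (text, encoding_used) for a response body.

    The declared charset is tried first, then each fallback in order.
    Every attempt uses strict error handling, so a wrong guess raises
    instead of silently producing mojibake.
    """
    candidates = []
    if declared:
        candidates.append(declared)
    candidates.extend(fallbacks)

    for encoding in candidates:
        try:
            return body.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: bytes are not valid in this encoding.
            # LookupError: the header named an encoding Python does not know.
            continue

    # latin-1 maps every byte to a code point, so this never fails.
    return body.decode("latin-1"), "latin-1"


# Usage: the header says utf-8, but the bytes are actually CP1252.
raw = "Price: €100 – €200".encode("cp1252")
text, used = decode_body(raw, declared="utf-8")
print(used, text)
```

Trying UTF-8 before CP1252 matters: almost any byte sequence decodes under CP1252, whereas CP1252 text containing non-ASCII characters is rarely valid UTF-8.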

CP1252 Smart Quotes

“Hello World” – said the ‘developer’ with a …pause.

Price: €100 – €200 (euro sign in CP1252)

Trademark™ and Copyright© symbols.

This café serves crème brûlée.
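
The sample lines above are useful precisely because their punctuation lives in the 0x80 to 0x9F range, where CP1252 and Latin-1 disagree. As a small illustration (the variable names are arbitrary), the following encodes the price line as CP1252 and decodes it three ways:

```python
# Encode the sample price line the way the server sends it.
raw = "Price: €100 – €200".encode("cp1252")

# Correct decoding: 0x80 is the euro sign and 0x96 is the en dash in CP1252.
as_cp1252 = raw.decode("cp1252")

# Latin-1 accepts the same bytes but maps 0x80 and 0x96 to invisible
# C1 control characters, so the symbols are silently lost.
as_latin1 = raw.decode("latin-1")

# Strict UTF-8 rejects the bytes outright, which is what makes the
# wrong header detectable.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(raw)
print(repr(as_cp1252), repr(as_latin1), utf8_ok)
```

A decoder that falls back to Latin-1 too early would therefore appear to succeed while dropping the euro signs and smart quotes.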

