Ticket #12838 (closed Bug: fixed)

Opened 2 years ago

Last modified 18 months ago

UnicodeDecodeError: 'utf8' codec can't decode -- plone.app.theming

Reported by: jianaijun Owned by: ldr
Priority: minor Milestone: 4.x
Component: Diazo (plone.app.theming) Version: 4.2
Keywords: UnicodeDecodeError Cc: davisagli, davilima6

Description

Some Chinese characters in the theme, such as 国, 园, 力, 程, trigger: UnicodeDecodeError: 'utf8' codec can't decode ...

Traceback

2012-04-10 23:16:57 ERROR plone.transformchain Unexpected error whilst trying to apply transform chain
Traceback (most recent call last):
  File "/home/free/Plone/buildout-cache/eggs/plone.transformchain-1.0.2-py2.7.egg/plone/transformchain/transformer.py", line 48, in __call__
    newResult = handler.transformIterable(result, encoding)
  File "/home/free/Plone/buildout-cache/eggs/plone.app.theming-1.0b9-py2.7.egg/plone/app/theming/transform.py", line 208, in transformIterable
    transform = self.setupTransform()
  File "/home/free/Plone/buildout-cache/eggs/plone.app.theming-1.0b9-py2.7.egg/plone/app/theming/transform.py", line 153, in setupTransform
    xsl_params=xslParams,
  File "/home/free/Plone/buildout-cache/eggs/diazo-1.0rc4-py2.7.egg/diazo/compiler.py", line 106, in compile_theme
    read_network=read_network,
  File "/home/free/Plone/buildout-cache/eggs/diazo-1.0rc4-py2.7.egg/diazo/rules.py", line 154, in process_rules
    rules_doc = expand_themes(rules_doc, parser, absolute_prefix, read_network)
  File "/home/free/Plone/buildout-cache/eggs/diazo-1.0rc4-py2.7.egg/diazo/rules.py", line 75, in expand_themes
    theme_doc = etree.parse(url, parser=parser)
  File "lxml.etree.pyx", line 2957, in lxml.etree.parse (src/lxml/lxml.etree.c:56230)
  File "parser.pxi", line 1533, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:82313)
  File "parser.pxi", line 1562, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:82606)
  File "parser.pxi", line 1462, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:81645)
  File "parser.pxi", line 1002, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:78554)
  File "parser.pxi", line 569, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:74498)
  File "parser.pxi", line 646, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75355)
  File "lxml.etree.pyx", line 282, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:7467)
  File "parser.pxi", line 447, in lxml.etree._local_resolver (src/lxml/lxml.etree.c:73358)
  File "docloader.pxi", line 146, in lxml.etree._ResolverRegistry.resolve (src/lxml/lxml.etree.c:69579)
  File "/home/free/Plone/buildout-cache/eggs/plone.app.theming-1.0b9-py2.7.egg/plone/app/theming/utils.py", line 103, in resolve
    result = result.decode(encoding).encode('ascii', 'xmlcharrefreplace')
  File "/home/free/Plone/Python-2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe5 in position 186: invalid continuation byte
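
The failing call in the last plone.app.theming frame (utils.py, line 103) decodes the fetched theme bytes and re-encodes them as ASCII with XML character references. The following is a minimal Python 3 sketch of that one expression, not the plone.app.theming code itself (the ticket's environments run Python 2). The helper name to_ascii_xml and the sample inputs are illustrative assumptions.

```python
def to_ascii_xml(data: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper mirroring the expression in the traceback:
    decode the bytes, then replace non-ASCII characters with &#NNNN; refs."""
    return data.decode(encoding).encode("ascii", "xmlcharrefreplace")


# Well-formed UTF-8 passes through and becomes a character reference.
good = "<p>国</p>".encode("utf-8")
print(to_ascii_xml(good))  # b'<p>&#22269;</p>'

# A lead byte (0xe5) that is not followed by a continuation byte fails
# with the same kind of error as in the traceback.
bad = b"<p>\xe5</p>"
try:
    to_ascii_xml(bad)
except UnicodeDecodeError as exc:
    print(exc.reason)
```

The decode step only succeeds if the bytes reaching it are still valid in the declared encoding, so any byte-level rewriting earlier in the chain surfaces here.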

ENV

Plone 4.2b2 (4203)
CMF 2.2.5
Zope 2.13.12
Python 2.7.2 (default, Apr 9 2012, 13:07:54) [GCC 4.4.5]
Linux debian 2.6.32-5-amd64 #1 SMP Sat Mar 31 04:00:05 UTC 2012 x86_64 GNU/Linux
plone.app.theming-1.0b9-py2.7.egg

ENV 2

Plone 4.1.4 (4113)
CMF 2.2.5
Zope 2.13.12
Python 2.6.7 (r267:88850, Apr 7 2012, 19:41:58) [GCC 4.5.1 20100924 (Red Hat 4.5.1-4)]
plone.app.theming-1.0b8

Attachments

test-theme.zip (779 bytes) - added by jianaijun 2 years ago.
Simple test theming

Change History

Changed 2 years ago by jianaijun

Simple test theming

comment:1 Changed 2 years ago by eleddy

  • Cc elro added
  • Component changed from Unknown to Diazo

comment:2 Changed 2 years ago by ldr

  • Cc davisagli added; elro removed
  • Status changed from new to confirmed

This is caused by ZPublisher's HTTPResponse.setBody mangling the data returned by the subrequest that fetches the theme file. It also happens if you view the file directly at http://127.0.0.1:8080/Plone/++theme++test-theme/index.html. For the code that does this, see: http://zope3.pov.lt/trac/browser/Zope/branches/2.13/src/ZPublisher/HTTPResponse.py#L499

The best way to avoid the mangling would seem to be to set an appropriate charset in the Content-Type header of the plone.resource response. For persistent resources, however, the object returned is just a standard OFS.File, which sets the Content-Type header itself.

Another option might be to work around it with plone.subrequest, but that seems wrong and potentially confusing.
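
The mangling described in comment:2 can be illustrated with a small Python 3 sketch. It assumes the publisher treats a text/html body without an explicit charset as Latin-1 and replaces the bytes 0x8b and 0x9b with the entities &lt; and &gt;. That rule is an assumption about the linked setBody code, and mangle_like_latin1_html is a hypothetical stand-in, not the ZPublisher implementation. What can be checked independently is that the UTF-8 encoding of each character listed in the description contains one of those two bytes.

```python
def mangle_like_latin1_html(body: bytes) -> bytes:
    """Hypothetical stand-in for the publisher's rewrite (assumed rule):
    replace the raw bytes 0x8b and 0x9b with HTML entities."""
    return body.replace(b"\x8b", b"&lt;").replace(b"\x9b", b"&gt;")


# The characters reported in the ticket description.
CHARS = ["国", "园", "力", "程"]


def check(ch: str):
    """Return (utf-8 bytes, mangled bytes, decode error reason or None)."""
    raw = ch.encode("utf-8")
    mangled = mangle_like_latin1_html(raw)
    try:
        mangled.decode("utf-8")
    except UnicodeDecodeError as exc:
        return raw, mangled, exc.reason
    return raw, mangled, None


for ch in CHARS:
    raw, mangled, reason = check(ch)
    print(ch, raw.hex(), mangled, reason)
```

Under that assumption, every multi-byte sequence is broken in the middle, which is consistent with the "invalid continuation byte" error in the traceback.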

comment:3 Changed 2 years ago by davilima6

  • Cc davilima6 added

comment:4 Changed 2 years ago by davisagli

I applied a fix in https://github.com/plone/plone.resource/commit/bde2d382850593fc357ac56633f7ddf2ad6ace73 that explicitly stores the encoding as utf-8 on text/html files imported from a zip, so that the publisher doesn't mangle the response. It only takes effect for new imports.
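
As a rough illustration of the idea in comment:4 (pinning an explicit charset on HTML files at import time), here is a Python sketch. It is not the code from the linked plone.resource commit; the helper name content_type_with_charset and its default are assumptions made for this example.

```python
import mimetypes


def content_type_with_charset(filename: str, default_charset: str = "utf-8") -> str:
    """Hypothetical helper: guess a MIME type from the file name and
    attach an explicit charset to text/html so it is never left implicit."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # Unknown extension: fall back to a generic binary type.
        return "application/octet-stream"
    if mime_type == "text/html":
        return "%s; charset=%s" % (mime_type, default_charset)
    return mime_type


print(content_type_with_charset("index.html"))
```

With the charset stated explicitly, a consumer no longer has to guess how the bytes are encoded. As the comment notes, a change applied at import time cannot affect files that were imported earlier.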

comment:5 Changed 19 months ago by ericof

It seems to be fixed for 4.2 and 4.3.

Should we close this?

comment:7 Changed 18 months ago by jianaijun

  • Status changed from confirmed to closed
  • Resolution set to fixed