Bug
Resolution: Fixed
Major
3.1.1
Redhat 6, PHP 5.6
MOODLE_31_STABLE
MOODLE_31_STABLE
MDL-55374-master
The HTML files created when converting online text submissions to PDF for feedback/annotation (via unoconv) are wrongly encoded (see attachment).
To reproduce (unoconv must be configured correctly on your Moodle instance):
- As a teacher, create an online text assignment
- As a student, submit a text with special characters like 'éèÉàÀ'
- As the teacher, go to the grade form of this student's assignment.
You should see that the text in the annotation is wrongly encoded.
I dug into the code a bit and found the cause:
- The method strip_images in "/mod/assign/feedback/editpdf/classes/document_services.php" uses DOMDocument::loadHTML, which causes the problem.
- It seems that, by default, DOMDocument::loadHTML treats the string as ISO-8859-1 rather than UTF-8. See http://stackoverflow.com/questions/8218230/php-domdocument-loadhtml-not-encoding-utf-8-correctly
The function format_text in "lib/weblib.php" already uses one of the proposed solutions: it prepends '<?xml version="1.0" encoding="UTF-8" ?>' to the string, which forces loadHTML to work in UTF-8.
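To make the behaviour described above easy to check outside Moodle, here is a minimal sketch. The helper first_paragraph_text is hypothetical (it is not part of Moodle), and the script assumes it is saved as UTF-8. Whether the unprefixed call mangles the text may depend on the libxml2 version PHP was built against, so only the prefixed case is treated as reliable here.

```php
<?php
// Sketch: compare how DOMDocument::loadHTML() reads a UTF-8 fragment
// with and without the XML declaration prefix.

// Hypothetical helper (not part of Moodle): text of the first <p> element.
function first_paragraph_text($html) {
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

$fragment = '<p>éèÉàÀ</p>'; // UTF-8 bytes, as submitted by the student.

// No encoding hint: the parser may read the bytes as ISO-8859-1.
$plain = first_paragraph_text($fragment);

// With the prefix, the parser reads the input as UTF-8.
$prefixed = first_paragraph_text('<?xml version="1.0" encoding="UTF-8" ?>' . $fragment);

var_dump($plain === 'éèÉàÀ');    // May be false, depending on the libxml2 version.
var_dump($prefixed === 'éèÉàÀ'); // The prefixed text should round-trip unchanged.
```

Note that the prefix ends up as an extra node at the top of the parsed document, so any code that serialises the whole document afterwards has to account for it.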
The following code change would fix the issue:
diff --git a/mod/assign/feedback/editpdf/classes/document_services.php b/mod/assign/feedback/editpdf/classes/document_services.php
index 4ada0a7..e348bc5 100644
--- a/mod/assign/feedback/editpdf/classes/document_services.php
+++ b/mod/assign/feedback/editpdf/classes/document_services.php
@@ -119,7 +119,7 @@ EOD;
      */
     protected static function strip_images($html) {
         $dom = new DOMDocument();
-        $dom->loadHTML($html);
+        $dom->loadHTML('<?xml version="1.0" encoding="UTF-8" ?>' . $html);
         $images = $dom->getElementsByTagName('img');
         $i = 0;
 
@@ -135,7 +135,7 @@ EOD;
             $text = $dom->createTextNode($replacement);
             $node->parentNode->replaceChild($text, $node);
         }
-        return $dom->saveHTML();
+        return trim(preg_replace('~<(?:!DOCTYPE|/?(?:html|body))[^>]*>\s*~i', '', $dom->saveHTML($dom->documentElement)));
     }
 
     /**
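For anyone who wants to try the patched logic without a Moodle install, here is a standalone sketch of it. The function name strip_images_sketch is hypothetical, and the loop body is an assumption: the diff only shows its last two lines, so the alt-text replacement is reconstructed from the variable names in the hunk context. The libxml error handling is also added for the sketch and is not part of the patch.

```php
<?php
// Standalone sketch of the patched method, outside the Moodle class.
function strip_images_sketch($html) {
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // The prefix makes loadHTML() parse the input as UTF-8.
    $dom->loadHTML('<?xml version="1.0" encoding="UTF-8" ?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = $dom->getElementsByTagName('img');
    // Walk backwards: the node list is live and shrinks as images are replaced.
    for ($i = $images->length - 1; $i >= 0; $i--) {
        $node = $images->item($i);
        // Assumed behaviour: replace each image with its alt text, if any.
        $replacement = $node->hasAttribute('alt')
            ? ' [ ' . $node->getAttribute('alt') . ' ] '
            : ' ';
        $text = $dom->createTextNode($replacement);
        $node->parentNode->replaceChild($text, $node);
    }

    // Serialise only the root element, which leaves out the added XML
    // declaration and the doctype, then drop the html/body wrapper tags.
    $out = $dom->saveHTML($dom->documentElement);
    return trim(preg_replace('~<(?:!DOCTYPE|/?(?:html|body))[^>]*>\s*~i', '', $out));
}

$result = strip_images_sketch('<p>éèÉàÀ <img src="photo.png" alt="photo"></p>');
echo $result, "\n";
```

Passing $dom->documentElement to saveHTML() is what keeps the prepended XML declaration out of the returned string; calling saveHTML() with no argument would serialise it along with the rest of the document.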