It was discussed in MDL-23978 that the cleaning features of format_text() may break the content so that it cannot be processed by portfolio_rewrite_pluginfile_urls(). I thought the issue could be solved just by ordering these steps correctly: format the text first, rewrite the URLs afterwards.
But I have now realized that the cleaning in format_text() does other things that were not spotted before: for example, it replaces spaces in URLs with %20. As a result, the URL rewrite can no longer find the linked file.
URL rewriting cannot be called before the HTML cleanup, because HTML Purifier removes the converted <img> tags completely. And it cannot be called after the cleanup, because the purifier may have already modified the URLs.
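The "after cleanup" half of the problem can be illustrated outside Moodle with a minimal Python sketch. This is not Moodle code: STORED_FILES, find_linked_file(), clean_url() and the example URL are all hypothetical stand-ins, and clean_url() mimics only the one side effect mentioned above (spaces becoming %20), not the real behaviour of HTML Purifier.

```python
# Hypothetical stand-in for the files stored in a file area.
STORED_FILES = {"my picture.png"}


def find_linked_file(url):
    """Return the stored filename the URL points to, or None if not found."""
    filename = url.rsplit("/", 1)[-1]
    return filename if filename in STORED_FILES else None


def clean_url(url):
    """Mimic a single side effect of cleaning: spaces are replaced by %20."""
    return url.replace(" ", "%20")


# Hypothetical URL as it might appear in the text before cleaning.
original = "http://example.com/pluginfile.php/1/mod_example/intro/my picture.png"
cleaned = clean_url(original)

print(find_linked_file(original))  # my picture.png
print(find_linked_file(cleaned))   # None
```

The lookup succeeds on the untouched URL and fails on the cleaned one, because the filename in the URL no longer matches the stored filename.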
This was discussed with Penny and we agreed that the only way is to call format_text() with the noclean option. This has pretty serious implications, because malicious code can be exported into a downloaded file or an external system. But as Petr mentioned in MDL-23978, it is the target system's responsibility to filter the input anyway, as it cannot trust any source.
- has a non-specific relationship to MDL-23978 "format_text strips images with src attributes it doesn't like" (Closed)