Skip to main content

Use Safe Parsers in `lxml` Parsing Functions

pixee:python/safe-lxml-parsing​

ImportanceReview GuidanceRequires Scanning Tool
HighMerge Without ReviewNo

This codemod sets the parser parameter in calls to lxml.etree.parse and lxml.etree.fromstring if omitted or set to None (the default value). Unfortunately, the default parser=None means lxml will rely on an unsafe parser, making your code potentially vulnerable to entity expansion attacks and external entity (XXE) attacks.

The changes look as follows:

  import lxml.etree
- lxml.etree.parse("path_to_file")
- lxml.etree.fromstring("xml_str")
+ lxml.etree.parse("path_to_file", parser=lxml.etree.XMLParser(resolve_entities=False))
+ lxml.etree.fromstring("xml_str", parser=lxml.etree.XMLParser(resolve_entities=False))

If you have feedback on this codemod, please let us know!

F.A.Q.​

Why is this codemod marked as Merge Without Review?​

We believe this change is safe, effective, and protects your code against very serious security attacks.

Codemod Settings​

N/A

References​