I'm using Python with pandas and Flask for some postprocessing tasks (analysis and visualization). Until now I uploaded and read *.csv, *.xlsx and *.xls files via pd.read_csv and pd.read_excel. Everything worked quite fine.
Now I have a *.xml file as a data source and tried to handle it the same way I usually do.
So I tried:
<form action="/input" method="POST" enctype="multipart/form-data">
    <input class="form-control" type="file" name="file">
    <input type="submit" class="btn btn-outline-secondary" name="Preview" value="Preview Data">
</form>

from flask import Flask, render_template, request
import pandas as pd
import xml.etree.ElementTree as ET

app = Flask(__name__)

@app.route("/input", methods=['POST', 'GET'])
def input():
    if request.method == 'POST':
        if request.form['Preview'] == "Preview Data":
            # request.files['file'] is a werkzeug FileStorage object
            file = request.files['file']
            filename = file.filename
            if '.xml' in filename:
                content = pd.read_xml(file, parser='lxml')
But when I pass a .xml file to the app via the form, I get this error:
File "C:\ProgramData\MiniforgeEnvs\TestEnv\lib\site-packages\pandas\io\xml.py", line 627, in _parse_doc
    with preprocess_data(handle_data) as xml_data:
AttributeError: __enter__
I tried several different options:
- When I use the built-in xml.etree package, it works fine:
import xml.etree.ElementTree as ET

if '.xml' in filename:
    tree = ET.parse(file)
    root = tree.getroot()
    print(root.attrib)
- When I load the .xml file directly from the app directory with pd.read_xml(), it also works fine:
if '.xml' in filename:
    content = pd.read_xml('SampleExport.xml', parser='lxml')
- I tried different parsers: "lxml" and "etree".
But in the end, whenever I pass the .xml file via the form input and call pd.read_xml(file, parser='lxml'), I get the error shown above.
I just solved my issue, even though I'm not quite sure why pd.read_xml() behaves differently from pd.read_csv() or pd.read_excel().
pd.read_xml is not able to read a FileStorage object. The variable returned by request.files is an instance of the class werkzeug.datastructures.FileStorage(stream=None, filename=None, name=None, content_type=None, content_length=None, headers=None).
Via its read() method I extracted the file content itself.
filestorage = request.files['file']
file = filestorage.read()
With this passed to pd.read_xml, it works fine.
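A minimal sketch of that workaround as a reusable helper. The function name read_uploaded_xml is made up for illustration, and wrapping the bytes in io.BytesIO is my own assumption (newer pandas versions prefer a file-like object over literal XML data). The "etree" parser is used here only so that the sketch does not depend on lxml.

```python
import io

import pandas as pd


def read_uploaded_xml(upload, parser="etree"):
    """Read an uploaded XML file into a DataFrame.

    `upload` is any object with a read() method that returns bytes,
    for example the FileStorage object from request.files['file'].
    (Hypothetical helper, not part of pandas or Flask.)
    """
    raw = upload.read()  # pull the raw bytes out of the upload wrapper
    # Hand pandas a plain in-memory buffer instead of the wrapper object.
    return pd.read_xml(io.BytesIO(raw), parser=parser)


if __name__ == "__main__":
    # Stand-in for an uploaded file: a small in-memory XML document.
    sample = b"<data><row><a>1</a><b>x</b></row><row><a>2</a><b>y</b></row></data>"
    print(read_uploaded_xml(io.BytesIO(sample)))
```

Inside the Flask view this would be called as content = read_uploaded_xml(request.files['file']).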
Can anybody explain why the _parse_doc() function of pd.read_xml() is not able to read a FileStorage object?
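One plausible explanation, judging only from the traceback: pandas uses the handle in a with statement, and Python looks up __enter__ on the object's type, not through __getattr__. As far as I can tell, FileStorage forwards unknown attributes such as read() to its underlying stream but does not define __enter__ itself, so the with statement fails. The sketch below reproduces that behaviour with a made-up StreamProxy class (a stand-in, not the real werkzeug class):

```python
import io


class StreamProxy:
    """Hypothetical stand-in for an upload wrapper.

    It forwards unknown attributes to the wrapped stream via __getattr__,
    but defines no __enter__/__exit__ of its own.
    """

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, name):
        # Only called for attributes not found the normal way.
        return getattr(self.stream, name)


def try_with(obj):
    """Return 'ok' if obj works in a with statement, else the exception class name."""
    try:
        with obj:
            return "ok"
    except (AttributeError, TypeError) as exc:
        # AttributeError on older Python versions, TypeError on newer ones.
        return type(exc).__name__


if __name__ == "__main__":
    proxy = StreamProxy(io.BytesIO(b"<data/>"))
    print(proxy.read())            # forwarded to the stream, works
    print(try_with(proxy))         # fails: dunder lookup bypasses __getattr__
    print(try_with(proxy.stream))  # the wrapped stream itself is a context manager
```

If this is indeed the cause, passing the wrapped stream (the stream attribute of the FileStorage object) or the bytes from read() avoids the problem, which matches the workaround above.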
Answered By – To0bias