Python Forum

Full Version: parsing multipart form data in Lambda
I am developing an AWS Lambda function in Python. The function sits behind AWS API Gateway.
I upload two files to the URL, and in the Lambda function I get the event data shown below.
I need help parsing this data and saving the two files to S3.


----------------------------124218046032878137249340
Content-Disposition: form-data; name="file1"; filename="2.csv"

Content-Type: text/csv

header,value

a,1
b,2

----------------------------124218046032878137249340

Content-Disposition: form-data; name="file2"; filename="1.txt"

Content-Type: text/plain

aa
bb
cc

----------------------------124218046032878137249340--
If it's an HTTP response, you could use Requests to parse it.
The data after Content-Type: text/plain can be read with .text or .content.
>>> import requests
>>> r = requests.get('http://httpbin.org/')
>>> r.status_code
200
>>> r.headers['content-type']
'text/html; charset=utf-8'

>>> print(r.text[:95])
<!DOCTYPE html>
<html>
<head>
  <meta http-equiv='content-type' value='text/html;charset=utf8'>
It's not an HTTP response; this is the service-side code. There is a Lambda function that receives the data as a byte array stream.
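To get from the Lambda event to raw bytes, something like the sketch below may help. It assumes the API Gateway Lambda proxy integration, where the request arrives in event["body"], event["isBase64Encoded"] and event["headers"]; the helper name get_body_and_content_type is made up for illustration.

```python
import base64


def get_body_and_content_type(event):
    """Return (raw_body_bytes, content_type) from an API Gateway proxy event.

    Assumes the Lambda proxy integration event shape; the function name
    is hypothetical.
    """
    # Header names can arrive in any case, so normalise them first.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    content_type = headers.get("content-type", "")

    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary payloads are delivered base64-encoded.
        raw = base64.b64decode(body)
    else:
        raw = body.encode("utf-8")
    return raw, content_type
```

The content type matters because it carries the multipart boundary string that separates the two files.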
It depends on what you want. A quick and dirty way to get the content is with a regex.
import re

data = '''\
----------------------------124218046032878137249340
Content-Disposition: form-data; name="file1"; filename="2.csv"

Content-Type: text/csv

header,value

a,1
b,2

----------------------------124218046032878137249340

Content-Disposition: form-data; name="file2"; filename="1.txt"

Content-Type: text/plain

aa
bb
cc'''

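# re.DOTALL lets "." span newlines; the greedy match runs to the last "comma + word character".
r_csv = re.search(r'Content-Type: text/csv(.*,\w)', data, re.DOTALL)
# Drop blank lines, then split each CSV row on commas.
lst = [i for i in r_csv.group(1).split('\n') if i != '']
result = [i.split(',') for i in lst]
print(result)
print('--------------------------')
# Everything after the text/plain header belongs to the second file.
r_text = re.search(r'Content-Type: text/plain(.*)', data, re.DOTALL)
print([i for i in r_text.group(1).split('\n') if i != ''])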
Output:
[['header', 'value'], ['a', '1'], ['b', '2']]
--------------------------
['aa', 'bb', 'cc']
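The regex depends on the exact content of the two files. If you want something less fragile, one option is to let the standard library's email package split the parts using the boundary from the request's Content-Type header. This is only a sketch: parse_multipart and save_parts are made-up helper names, the bucket name is whatever you use, and s3_client is assumed to be a boto3 S3 client (boto3.client("s3")) passed in by the caller. For arbitrary binary uploads a dedicated multipart parser may be a safer choice.

```python
from email import policy
from email.parser import BytesParser


def parse_multipart(body, content_type):
    """Return a list of (field_name, filename, payload_bytes) tuples.

    body is the raw request body as bytes; content_type is the request's
    Content-Type header value, including the boundary parameter.
    """
    # The email parser needs the Content-Type header in front of the body
    # so that it knows the boundary string.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = BytesParser(policy=policy.default).parsebytes(prefix + body)

    parts = []
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        filename = part.get_filename()
        parts.append((name, filename, part.get_payload(decode=True)))
    return parts


def save_parts(s3_client, bucket, parts):
    """Upload every part that has a filename to the given bucket."""
    for _name, filename, payload in parts:
        if filename:
            # put_object is the boto3 S3 client call; the key here is
            # simply the uploaded filename (an assumption).
            s3_client.put_object(Bucket=bucket, Key=filename, Body=payload)
```

In the handler you would call parse_multipart with the raw body bytes and the Content-Type header, then pass the result to save_parts together with your S3 client and bucket name.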