Posts: 89
Threads: 26
Joined: Mar 2020
Mar-23-2020, 09:40 PM
(This post was last modified: Mar-23-2020, 09:40 PM by BrandonKastning.)
Hello everyone!
How would someone take the following file:
Supreme Court of the United States (1 U.S. 1) (1754)
I believe this is a "nested" JSON file? I would like to find a simple way to convert it to a CSV file.
Thanks in advance!
Posts: 7,324
Threads: 123
Joined: Sep 2016
Mar-23-2020, 11:03 PM
(This post was last modified: Mar-23-2020, 11:04 PM by snippsat.)
Since it's on GitHub, you can get everything with:
git clone https://github.com/brianwc/bulk_scotus.git
To get a single file, click Raw, then right-click and choose Save as.
The normal way to read JSON is to let Python convert it to a dictionary.
Here is a demo that downloads this file with Requests; for a file on local disk, use the json module.
>>> import requests
>>> from pprint import pprint
>>> url = 'https://raw.githubusercontent.com/brianwc/bulk_scotus/master/1700s/1754/84581.json'
>>> response = requests.get(url)
>>> record = response.json()
>>> pprint(record)
{'absolute_url': '/opinion/84581/anonymous/',
'blocked': False,
'citation': {'case_name': 'Anonymous',
'docket_number': None,
'document_uris': ['/api/rest/v2/document/84581/'],
'federal_cite_one': '1 U.S. 1',
'federal_cite_three': None,
'federal_cite_two': None,
'id': 51503,
'lexis_cite': '',
'neutral_cite': None,
'resource_uri': '/api/rest/v2/citation/51503/',
'scotus_early_cite': None,
'specialty_cite_one': None,
'state_cite_one': None,
'state_cite_regional': None,
'state_cite_three': None,
'state_cite_two': None,
'westlaw_cite': None},
'citation_count': 14,
'court': '/api/rest/v2/jurisdiction/scotus/',
'date_blocked': None,
'date_filed': '1754-09-01',
'date_modified': '2014-12-21T01:23:36.691370',
'docket': '/api/rest/v2/docket/515705/',
'download_url': 'http://bulk.resource.org/courts.gov/c/US/1/1.US.1.html',
'extracted_by_ocr': False,
'html': '<p class="case_cite">1 U.S. 1</p>\n'
' <p class="case_cite">1 Dall. 1</p>\n'
' <p class="case_cite">1 L.Ed. 11</p>\n'
' <p class="parties">Anonymous.</p>\n'
' <p class="court">Supreme Court of Pennsylvania</p>\n'
' <p class="date">September Term, 1754.</p>\n'
' <p>1 U.S. 1</p>\n'
' <p>1 Dall. 1</p>\n'
' <p>1 L.Ed. 11</p>\n'
' <div class="num" id="p1">\n'
' <span class="num">1</span>\n'
' <p>Anonymous.</p>\n'
' </div>\n'
' <p>Supreme Court of Pennsylvania</p>\n'
' <div class="num" id="p2">\n'
' <span class="num">2</span>\n'
' <p>September Term, 1754.</p>\n'
' </div>\n'
' <div class="num" id="p3">\n'
' <span class="num">3</span>\n'
' <p class="indent">Adjudged by the Court, that the Statute of '
'Frauds and Perjuries<a class="footnote" href="#fn-s" '
'id="fn-s_ref">*</a> does not extend to this Province, though made '
"before Mr. Penn's Charter: The Governor of New-York having exercised "
'A Jurisdiction here, before the making that Statute, by Virtue of '
'the Word Territories, in the Grant to the Duke of York, of New-York '
'and New-Jersey.</p>\n'
' </div>\n'
' <div class="footnotes">\n'
' <div class="footnote" id="fn-s">\n'
' <a class="footnote" href="#fn-s_ref">*</a>\n'
' <p> 29.Car.2.c.3. This statute was supplied, however by an '
'act of General Assembly passed the 12 Geo. 3. 31. 1 State Laws 462. '
'and sec 2 P. Will. 75.</p>\n'
' </div>\n'
' </div>\n'
' ',
'html_lawbox': '',
'html_with_citations': '<p class="case_cite"><span class="citation '
'no-link"><span class="volume">1</span> <span '
'class="reporter">U.S.</span> <span '
'class="page">1</span></span></p>\n'
' <p class="case_cite">1 Dall. 1</p>\n'
' <p class="case_cite">1 L.Ed. 11</p>\n'
' <p class="parties">Anonymous.</p>\n'
' <p class="court">Supreme Court of '
'Pennsylvania</p>\n'
' <p class="date">September Term, 1754.</p>\n'
' <p><span class="citation no-link"><span '
'class="volume">1</span> <span '
'class="reporter">U.S.</span> <span '
'class="page">1</span></span></p>\n'
' <p>1 Dall. 1</p>\n'
' <p>1 L.Ed. 11</p>\n'
' <div class="num" id="p1">\n'
' <span class="num">1</span>\n'
' <p>Anonymous.</p>\n'
' </div>\n'
' <p>Supreme Court of Pennsylvania</p>\n'
' <div class="num" id="p2">\n'
' <span class="num">2</span>\n'
' <p>September Term, 1754.</p>\n'
' </div>\n'
' <div class="num" id="p3">\n'
' <span class="num">3</span>\n'
' <p class="indent">Adjudged by the Court, that '
'the Statute of Frauds and Perjuries<a '
'class="footnote" href="#fn-s" id="fn-s_ref">*</a> '
'does not extend to this Province, though made before '
"Mr. Penn's Charter: The Governor of New-York having "
'exercised A Jurisdiction here, before the making that '
'Statute, by Virtue of the Word Territories, in the '
'Grant to the Duke of York, of New-York and '
'New-Jersey.</p>\n'
' </div>\n'
' <div class="footnotes">\n'
' <div class="footnote" id="fn-s">\n'
' <a class="footnote" href="#fn-s_ref">*</a>\n'
' <p> 29.Car.2.c.3. This statute was supplied, '
'however by an act of General Assembly passed the 12 '
'Geo. 3. 31. 1 State Laws 462. and sec 2 P. Will. '
'75.</p>\n'
' </div>\n'
' </div>\n'
' ',
'id': 84581,
'judges': '',
'local_path': None,
'nature_of_suit': '',
'plain_text': '',
'precedential_status': 'Published',
'resource_uri': '/api/rest/v2/document/84581/',
'sha1': 'd01a5ea90493b2357922308f2456c5d1479737f4',
'source': 'R',
'supreme_court_db_id': None,
'time_retrieved': '2010-04-28T08:59:25'}
record is now a dictionary, so the data is easy to access.
>>> record['date_filed']
'1754-09-01'
>>> record['citation']['federal_cite_one']
'1 U.S. 1'
Looking at the data, a dictionary makes more sense to work with here than CSV would.
You can also load it into Pandas if you need to do more manipulation of the data.
Once it is inside a DataFrame, it is also easy to output to CSV with df.to_csv().
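For the original question (nested JSON to CSV), here is a minimal sketch that uses only the standard library. It assumes a small inline sample with a few of the keys shown above instead of the downloaded file, and `flatten` is a hypothetical helper written for this example, not part of any library.

```python
import csv
import io
import json

# Inline stand-in for the downloaded file (assumption: only a few keys kept).
raw = """
{
  "id": 84581,
  "date_filed": "1754-09-01",
  "citation": {"case_name": "Anonymous", "federal_cite_one": "1 U.S. 1"}
}
"""
record = json.loads(raw)  # for a file on disk, use json.load(file_object)


def flatten(data, prefix=""):
    """Hypothetical helper: turn nested dicts into one flat dict,
    joining parent and child key names with an underscore."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


row = flatten(record)

# Written to a string here; pass open("court.csv", "w", newline="")
# instead of the buffer to get a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

Each nested key becomes its own column, e.g. citation_federal_cite_one. To convert many files, you would flatten each record and call writer.writerow() once per record.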
Posts: 89
Threads: 26
Joined: Mar 2020
snippsat,
I really like the idea of making the raw github .json file a python dictionary.
>>> record['date_filed']
'1754-09-01'
>>> record['citation']['federal_cite_one']
'1 U.S. 1'
Is the above >>> a Python console command line, where you typed record['date_filed'] and it output '1754-09-01'?
If I am understanding correctly, the dictionary values such as record['date_filed'] and all the rest would need to be sent to a DataFrame within Python, and from that DataFrame I could send them to .csv using df.to_csv(). Could this same process work from a DataFrame to MySQL/MariaDB?
Is Pandas a must? Or is moving the data from one format to another sufficient without using Pandas? I notice you say it's useful for those who would require more data manipulation.
Are there ways of using dataframes in Python without using Pandas, or is Pandas recommended for something like this?
Does Pandas have df.to_csv() and df.to_mysql()?
Thank you a bunch! Giving me a bit of hope on figuring this out one day!
Posts: 7,324
Threads: 123
Joined: Sep 2016
Mar-24-2020, 01:03 AM
(This post was last modified: Mar-24-2020, 01:06 AM by snippsat.)
(Mar-23-2020, 11:15 PM)BrandonKastning Wrote: Is the above >>> a Python console command line, where you typed record['date_filed'] and it output '1754-09-01'?
>>> is the prompt of the interactive shell that comes with every Python version; I use a better one, ptpython.
This is very basic Python knowledge, which you may struggle with for now.
Run as a .py file it would look like this; there you need to use print().
import requests
from pprint import pprint
url = 'https://raw.githubusercontent.com/brianwc/bulk_scotus/master/1700s/1754/84581.json'
response = requests.get(url)
record = response.json()
#pprint(record)
print(record['date_filed'])
print(record['citation']['federal_cite_one'])
Output:
1754-09-01
1 U.S. 1
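To make the difference between storing a value and showing it concrete, here is a minimal sketch. It assumes a small inline dictionary in place of the downloaded record, and the variable names are only illustrative.

```python
# Inline stand-in for the downloaded record (assumption: only two keys kept).
record = {
    "date_filed": "1754-09-01",
    "citation": {"federal_cite_one": "1 U.S. 1"},
}

# Assignment stores the value under a name of your choice.
date_filed = record["date_filed"]
federal_cite = record["citation"]["federal_cite_one"]

# print() only displays a value; its own return value is None,
# so assigning the result of print() does not keep the data.
result_of_print = print(date_filed)

print(federal_cite)
print(result_of_print)
```

In other words, assign with = to keep a value and call print() only when you want to see it.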
(Mar-23-2020, 11:15 PM)BrandonKastning Wrote: Does Pandas have df.to_csv() and df.to_mysql()?
Pandas has both to_csv and to_sql; there is no to_mysql.
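As a rough sketch of those two methods, assuming Pandas is installed and using a small inline sample instead of the downloaded record: the in-memory SQLite database and the table name court_table are choices made for this example only.

```python
import sqlite3

import pandas as pd

# Inline stand-in for the downloaded record (assumption: only a few keys kept).
record = {
    "id": 84581,
    "date_filed": "1754-09-01",
    "citation": {"case_name": "Anonymous", "federal_cite_one": "1 U.S. 1"},
}

# json_normalize flattens nested dicts into columns; with sep="_" the
# nested key becomes e.g. "citation_federal_cite_one".
df = pd.json_normalize(record, sep="_")

# With no path given, to_csv returns the CSV as a string.
csv_text = df.to_csv(index=False)
print(csv_text)

# to_sql accepts a sqlite3 connection directly.
conn = sqlite3.connect(":memory:")
df.to_sql("court_table", conn, index=False)
rows = conn.execute("SELECT id, date_filed FROM court_table").fetchall()
print(rows)
```

For MySQL/MariaDB, to_sql is normally given an SQLAlchemy engine rather than a sqlite3 connection.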
(Mar-23-2020, 11:15 PM)BrandonKastning Wrote: Is Pandas a must? Or is moving the data from one format to another sufficient without using Pandas? I notice you say it's useful for those who would require more data manipulation.
Are there ways of using dataframes in Python without using Pandas, or is Pandas recommended for something like this?
No, Pandas is not a must. It can make data manipulation easier, depending on what you need to do with the data.
You can just use the dictionary and, e.g., write the chosen values to a database.
There are also easier DBs, like TinyDB or dataset.
Here is a demo of dataset, which has a lot of power as it is built on top of SQLAlchemy.
import requests
from pprint import pprint
import dataset
url = 'https://raw.githubusercontent.com/brianwc/bulk_scotus/master/1700s/1754/84581.json'
response = requests.get(url)
record = response.json()
#pprint(record)
print(record['date_filed'])
print(record['citation']['federal_cite_one'])
#--- DB
db = dataset.connect('sqlite:///court.db')
table = db['court_table']
table.insert(dict(record=record['date_filed'], case=100))
table.insert(dict(record=record['citation']['federal_cite_one'], case=25))
Test.
λ ptpython -i court.py
>>> table.find_one(case=25)
OrderedDict([('id', 2), ('record', '1 U.S. 1'), ('case', 25)])
# can also do SQL queries
>>> result = db.query('SELECT * FROM court_table;')
>>> for row in result:
...     print(row)
OrderedDict([('id', 1), ('record', '1754-09-01'), ('case', 100)])
OrderedDict([('id', 2), ('record', '1 U.S. 1'), ('case', 25)])
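If you would rather not install dataset, a similar result is possible with the sqlite3 module from the standard library. This is only a sketch under a few assumptions: an inline sample replaces the downloaded record, the database is in memory, and the column names date_filed and federal_cite are chosen for this example.

```python
import sqlite3

# Inline stand-in for the downloaded record (assumption: only a few keys kept).
record = {
    "id": 84581,
    "date_filed": "1754-09-01",
    "citation": {"federal_cite_one": "1 U.S. 1"},
}

conn = sqlite3.connect(":memory:")  # use "court.db" for a file on disk
conn.execute(
    "CREATE TABLE court_table ("
    "id INTEGER PRIMARY KEY, date_filed TEXT, federal_cite TEXT)"
)

# One row per case, one named column per chosen value.
conn.execute(
    "INSERT INTO court_table (id, date_filed, federal_cite) VALUES (?, ?, ?)",
    (
        record["id"],
        record["date_filed"],
        record["citation"]["federal_cite_one"],
    ),
)
conn.commit()

rows = conn.execute("SELECT * FROM court_table").fetchall()
print(rows)
```

Here each case is a single row, which tends to be easier to query than storing every value in one shared column.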
Posts: 89
Threads: 26
Joined: Mar 2020
Apr-19-2020, 05:18 AM
(This post was last modified: Apr-19-2020, 05:21 AM by BrandonKastning.)
snippsat,
Like you said this is very basic Python that is difficult for me as a newbie. I agree. Could I ask a few things to help me understand how to use what's being brought into python a little better until I fully understand how to read and modify the existing code?
for instance;
print(record['date_filed'])
print(record['citation']['federal_cite_one'])
How would you assign the value of 'date_filed' to
pythonvariable_date_filed = print(record['date_filed']) # DEFINED #
pythonvariable_citation = print(record['citation']['federal_cite_one']) # DEFINED #
I don't think I am understanding correctly.
Basically, I want to know how to take what was once the .json file and turn it into a dictionary, like the Python program does using import requests.
How do I assign a Python variable to each record in the dictionary?
I want to learn this before passing the data over to dataset, to make it easier for me to understand.