Python Forum

Full Version: Extract values from array
How do we extract values from an array?
Hi,
I'm new to Python and need your help. Can someone please help?
I have the output below, and I want to extract the Sprint values from the JSON using Python. Please refer to the code highlighted in red. Custom field 98765 has 3 Sprints with their corresponding states. I want to extract
a) the first Sprint, Sprint_2, and its state, 'CLOSED'
b) the last Sprint, Sprint_4, and its state, 'ACTIVE'
There can be any number of Sprints. In the example below I have 3 Sprints (Sprint_2, Sprint_3, Sprint_4), but there can be 0 or more.

===========================================================================================================


{'expand': 'operations,editmeta,changelog,transitions,renderedFields', 'id': '123456',
 'fields': {
    'issuetype': {
       'id': '123',
       'description': '',
       'iconUrl': 'https://issuetracking.test.net/jira00/images/icons/issuetypes/test.png',
       'name': 'Story',
       'subtask': False,
       'self': 'https://issuetracking.test.net/jira00/rest/api/2/issuetype/123'},
    'customfield_98765': [
        3d9387[rapidViewId=1006,state=CLOSED,name=Sprint_2,startDate=2017-02-15T07:01:14.040-05:00,endDate=2017-02-28T07:01:00.000-05:00,completeDate=2017-03-01T09:47:50.906-05:00,sequence=34567,id=34567],
        39888[rapidViewId=1006,state=CLOSED,name=Sprint_3,startDate=2017-03-01T09:53:57.981-05:00,endDate=2017-03-14T09:53:00.000-04:00,completeDate=2017-03-15T08:45:12.519-04:00,sequence=45678,id=45678],
        34drw0c[rapidViewId=1006,state=ACTIVE,name=Sprint_4,startDate=2017-03-15T03:38:24.957-04:00,endDate=2017-03-28T03:38:00.000-04:00,completeDate=<null>,sequence=13579,id=13579]]},
 'self': 'https://issuetracking.test.net/jira00/rest/api/latest/issue/123456',
 'key': 'ABCDEF-67890'}
===================================================================================================

Thanks in advance.

Moderator Larz60+: Added Python tags. Please do this in the future (see help, BBCODE)
import json


with open('YourFileName.json') as f:
   data = json.load(f)
print(data)
Do this, then show the results and I'll explain how to access the field data.
Arrays are called lists in Python.
There are many ways to extract data from a list.
You haven't given us anything to go on!
What type of data is in your list?
What is its structure?
etc.
(Apr-17-2017, 05:04 AM)Larz60+ Wrote:
import json

with open('YourFileName.json') as f:
   data = json.load(f)
print(data)
Do this, then show the results and I'll explain how to access the field data.

I'm connecting to JIRA using REST API and have the results stored in an object.

result = requests.get(searchUrl % ( jira_instance, fields, jira_filter ), auth=(username, password))
jsonobj = result.json()
print(jsonobj)
#when I print jsonobj I get the result I posted in my original query.
Your JSON object is a nested dictionary, probably containing some lists. You access dictionary values by key, while lists are indexed by number. From the print output it is hard to tell exactly what your jsonobj is, but you can try
jsonobj['fields']['customfield_98765']
It probably returns a list of Sprints. If so, you can access the first one with
jsonobj['fields']['customfield_98765'][0]
and the last one with
jsonobj['fields']['customfield_98765'][-1]
If it doesn't work, post the output of repr(jsonobj).
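Since the field can hold zero or more Sprints, plain [0] and [-1] indexing raises IndexError on an empty list. Below is a minimal sketch of a guard for that case. The helper name first_and_last and the sample dict are made up for illustration, and treating a missing or None field as "no Sprints" is an assumption about what the API returns.

```python
def first_and_last(items):
    """Return (first, last) of a list, or (None, None) if it is empty or None."""
    if not items:
        return None, None
    return items[0], items[-1]


# Made-up sample shaped like the posted output; the values are placeholders.
jsonobj = {'fields': {'customfield_98765': ['sprint A', 'sprint B', 'sprint C']}}

# .get() returns None instead of raising KeyError if the field is absent.
sprints = jsonobj['fields'].get('customfield_98765')
first, last = first_and_last(sprints)
print(first, last)
```

With a single Sprint in the list, first and last are the same element.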
(Apr-17-2017, 08:33 AM)zivoni Wrote: Your JSON object is a nested dictionary, probably containing some lists. You access dictionary values by key, while lists are indexed by number. From the print output it is hard to tell exactly what your jsonobj is, but you can try
 jsonobj['fields']['customfield_98765']
It probably returns a list of Sprints. If so, you can access the first one with
 jsonobj['fields']['customfield_98765'][0]
and the last one with
 jsonobj['fields']['customfield_98765'][-1]
If it doesn't work, post the output of repr(jsonobj).

Hi Zivoni,

Thanks for your response. Let me explain the problem in more detail. Hope that helps...
Here's my code...


import requests

inst = 'jira00'
filt = 'project in (xyz)'
fields = 'key,issuetype,customfield_98765,customfield_12345'

# URL template: Jira instance, comma-separated field list, JQL filter
searchUrl = 'https://issuetracking.jpmchase.net/%s/rest/api/latest/search?fields=%s&maxResults=-1&jql=%s'
result = requests.get(searchUrl % (inst, fields, filt))
jsonobj = result.json()
print(jsonobj)

#==============================================================================================================================================================================================================================

#When I print jsonobj, it prints the following:
#{'expand': 'operations,editmeta,changelog,transitions,renderedFields', 'id': '123456',
#  'fields': {
#     'issuetype': {
#        'id': '123',
#        'description': '',
#        'iconUrl': 'https://issuetracking.test.net/jira00/images/icons/issuetypes/test.png',
#        'name': 'Story',
#        'subtask': False,
#        'self': 'https://issuetracking.test.net/jira00/rest/api/2/issuetype/123'},
#       'customfield_98765': [
#              'com.atlassian.service.sprint.Sprint3d9387[rapidViewId=1006,state=CLOSED,name=Sprint_2,startDate=2017-02-15T07:01:14.040-05:00,endDate=2017-02-28T07:01:00.000-05:00,completeDate=2017-03-01T09:47:50.906-05:00,sequence=34567,id=34567]',
#              'com.atlassian.service.sprint.Sprint39888[rapidViewId=1006,state=CLOSED,name=Sprint_3,startDate=2017-03-01T09:53:57.981-05:00,endDate=2017-03-14T09:53:00.000-04:00,completeDate=2017-03-15T08:45:12.519-04:00,sequence=45678,id=45678]',
#              'com.atlassian.service.sprint.Sprint34drw0c[rapidViewId=1006,state=ACTIVE,name=Sprint_4,startDate=2017-03-15T03:38:24.957-04:00,endDate=2017-03-28T03:38:00.000-04:00,completeDate=<null>,sequence=13579,id=13579]']},
#  'self': 'https://issuetracking.test.net/jira00/rest/api/latest/issue/123456',
#  'key': 'ABCDEF-67890'}

#==============================================================================================================================================================================================================================
opt = []

# Build one row per issue returned by the search
for issue in jsonobj['issues']:
   row = {}
   row['issueType'] = issue['fields']['issuetype']['name']
   row['link'] = issue['fields']['customfield_12345']
   row['key'] = issue['key']
   row['sprint'] = issue['fields']['customfield_98765']
   opt.append(row)

print(opt)
#==============================================================================================================================================================================================================================

#When I print opt, it displays rows like the one below
{'link': None, 'sprint': ['com.atlassian.greenhopper.service.sprint.Sprint@123ed5s[rapidViewId=1234,state=CLOSED,name=Sprint_2,startDate=2017-03-29T10:36:54.327-04:00,endDate=2017-04-11T10:36:00.000-04:00,completeDate=2017-04-12T12:25:51.156-04:00,sequence=12345,id=12345]', 'com.atlassian.greenhopper.service.sprint.Sprint@5rd3d5gd[rapidViewId=1234,state=ACTIVE,name=Sprint_3,startDate=2017-04-12T01:34:47.270-04:00,endDate=2017-04-25T01:34:00.000-04:00,completeDate=<null>,sequence=67890,id=67890]'], 'issueType': 'Story', 'key': 'ABCDEF-55555'}
The problem is that 'sprint' holds the complete list, whereas I want to extract only Sprint_2 and Sprint_3, i.e. the first and the last sprint in the 'sprint' list. I could use two variables, startSprint and lastSprint, to store the values Sprint_2 and Sprint_3 respectively.
#==============================================================================================================================================================================================================================
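One way to get startSprint and lastSprint is to split each Sprint string into key/value pairs and then take the first and last entries. The sketch below is only an illustration. The helper names parse_sprint and first_and_last_sprint are hypothetical, the sample strings are abbreviated copies of the row above, and it assumes no value (for example a Sprint name) contains a comma.

```python
def parse_sprint(raw):
    """Turn 'Class@hash[key=value,key=value,...]' into a dict of strings."""
    # Keep only the text between the first '[' and the last ']'.
    body = raw[raw.index('[') + 1:raw.rindex(']')]
    # Assumption: values never contain a comma, so a plain split is enough.
    return dict(pair.split('=', 1) for pair in body.split(','))


def first_and_last_sprint(sprints):
    """Return parsed (first, last) Sprints, or (None, None) if there are none."""
    if not sprints:
        return None, None
    return parse_sprint(sprints[0]), parse_sprint(sprints[-1])


# Abbreviated versions of the strings shown in the row above.
sprints = [
    'com.atlassian.greenhopper.service.sprint.Sprint@123ed5s'
    '[rapidViewId=1234,state=CLOSED,name=Sprint_2,sequence=12345,id=12345]',
    'com.atlassian.greenhopper.service.sprint.Sprint@5rd3d5gd'
    '[rapidViewId=1234,state=ACTIVE,name=Sprint_3,sequence=67890,id=67890]',
]

start, last = first_and_last_sprint(sprints)
startSprint = start['name'] if start else None
lastSprint = last['name'] if last else None
print(startSprint, start['state'])  # Sprint_2 CLOSED
print(lastSprint, last['state'])    # Sprint_3 ACTIVE
```

Parsing into a dict keeps every field available (state, dates, id), so the same helper works if more than the name is needed later.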

Moderator Larz60+: Added Python tags. Please do this in the future (see help, BBCODE)
Regular expressions?

>>> data = {'link': None, 'sprint': ['com.atlassian.greenhopper.service.sprint.Sprint@123ed5s[rapidViewId=1234,state=CLOSED,name=Sprint_2,startDate=2017-03-29T10:36:54.327-04:00,endDate=2017-04-11T10:36:00.000-04:00,completeDate=2017-04-12T12:25:51.156-04:00,sequence=12345,id=12345]', 'com.atlassian.greenhopper.service.sprint.Sprint@5rd3d5gd[rapidViewId=1234,state=ACTIVE,name=Sprint_3,startDate=2017-04-12T01:34:47.270-04:00,endDate=2017-04-25T01:34:00.000-04:00,completeDate=<null>,sequence=67890,id=67890]'], 'issueType': 'Story', 'key': 'ABCDEF-55555'}
>>> sprint = data['sprint']
>>> for item in sprint:
...   print(item)
...
com.atlassian.greenhopper.service.sprint.Sprint@123ed5s[rapidViewId=1234,state=CLOSED,name=Sprint_2,startDate=2017-03-29T10:36:54.327-04:00,endDate=2017-04-11T10:36:00.000-04:00,completeDate=2017-04-12T12:25:51.156-04:00,sequence=12345,id=12345]
com.atlassian.greenhopper.service.sprint.Sprint@5rd3d5gd[rapidViewId=1234,state=ACTIVE,name=Sprint_3,startDate=2017-04-12T01:34:47.270-04:00,endDate=2017-04-25T01:34:00.000-04:00,completeDate=<null>,sequence=67890,id=67890]
>>> import re
>>> regex = re.compile(r"state=(.+),name=Sprint_[23]")
>>> for item in sprint:
...     match = regex.search(item)
...     if match:
...         print(match.groups()[0])
...     else:
...         print("no match")
...
CLOSED
ACTIVE
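The pattern above only matches Sprint_2 and Sprint_3 because of the [23] character class. A sketch of a more general variant follows, which captures both the state and the name of any Sprint. The names SPRINT_RE and state_and_name are made up for this example, and it assumes the fields always appear in the order state, name and that neither value contains a comma.

```python
import re

# Capture the state and the name; each value runs up to the next comma or ']'.
SPRINT_RE = re.compile(r"state=([^,]+),name=([^,\]]+)")


def state_and_name(raw):
    """Return (state, name) from one Sprint string, or None if it does not match."""
    match = SPRINT_RE.search(raw)
    return match.groups() if match else None


# Abbreviated versions of the strings from the session above.
sprint = [
    'com.atlassian.greenhopper.service.sprint.Sprint@123ed5s'
    '[rapidViewId=1234,state=CLOSED,name=Sprint_2,sequence=12345,id=12345]',
    'com.atlassian.greenhopper.service.sprint.Sprint@5rd3d5gd'
    '[rapidViewId=1234,state=ACTIVE,name=Sprint_3,sequence=67890,id=67890]',
]

if sprint:  # the list may be empty when an issue has no Sprints
    print(state_and_name(sprint[0]))   # ('CLOSED', 'Sprint_2')
    print(state_and_name(sprint[-1]))  # ('ACTIVE', 'Sprint_3')
```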
(Apr-18-2017, 03:52 AM)nilamo Wrote: Regular expressions?


Thanks a ton, Nilamo. This worked.