[split] Rearranging CSV columns and rows

Ivan1 · (This post was last modified: Aug-30-2017, 07:33 PM by metulburr.)

Thank you so much - it worked! I followed your exact instructions and it worked perfectly. I now have both values for 'entityCommonStockSharesOutstanding' written in CSV instead of just one. Again thank you so much!

The only thing is that the populated cells in each row of the CSV file do not match the CSV's column headers (as you correctly indicated that this would happen in your earlier post).

For example, where there are two values for 'entityCommonStockSharesOutstanding', then one value will be in a cell in same column as the 'EntityCommonStockSharesOutstanding' header, whilst the other would be written to next cell on right and this falls under the 'DocumentCreationDate' column header.

I tried to fix this by adding one more 'EntityCommonStockSharesOutstanding' in the code as below:

ratiofile.write( "TradingSymbol,EntityRegistrantName,EntityCentralIndexKey,CurrentFiscalYearEndDate,DocumentType,DocumentPeriodEndDate,DocumentFiscalYearFocus,DocumentFiscalPeriodFocus,EntityCommonStockSharesOutstanding,EntityCommonStockSharesOutstanding,DocumentCreationDate,CommonStockSharesIssued,CommonStockSharesIssued,TreasuryStockShares,CommonStockSharesOutstanding\n" )

And it does fix the problem, but for the other security (different row in CSV), as it does not have two values for 'entityCommonStockSharesOutstanding', it writes the values to different header columns (for example: 'commonStockSharesIssued' ends up being written under the 'DocumentCreationDate' header column)

It would be great to know how to fix this - but actually, more importantly, I am now trying to learn how I can write to CSV (one cell only) the SUM of All Values that are available under the 'entityCommonStockSharesOutstanding'.

For example, if the below code writes to CSV the following values for 'entityCommonStockSharesOutstanding': 5 and 2, in the same row, but in different cells, then is there a way I can modify the code to return the SUM, being 7, as the 'entityCommonStockSharesOutstanding' and have this value of 7 written to one cell only?

I have enclosed the full script below as it stands now:

# Copyright 2014 Altova GmbH
# 
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# 
#	  http
# 
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os, re, sys
from altova import xml, xsd, xbrl

sec_ns =  '/dei/'		# was: 
fasb_ns = '/us-gaap/'	# was: 

# We need to implement a simple locking class that allows us to avoid having 
# on_xbrl_valid called from multiple threads in RaptorXML at the same time.

if sys.platform == 'win32':
	import msvcrt

	# fcntl is not available under Windows, so we'll use a 
	# file locking function from the msvcrt library instead...

	class Lock:

		def __init__(self, filename):
			self.filename = filename
			# This will create it if it does not exist already
			self.handle = open(filename, 'w')

		# Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock
		def acquire(self):
			msvcrt.locking(self.handle.fileno(), msvcrt.LK_LOCK, 1)

		def release(self):
			msvcrt.locking(self.handle.fileno(), msvcrt.LK_UNLCK, 1)

		def __del__(self):
			self.handle.close()

else:
	import fcntl

	# Under Linux and MacOS we can use the fcntl library to implement
	# the simple file locking mechanism...

	class Lock:

		def __init__(self, filename):
			self.filename = filename
			# This will create it if it does not exist already
			self.handle = open(filename, 'w')

		# Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock
		def acquire(self):
			fcntl.flock(self.handle, fcntl.LOCK_EX)

		def release(self):
			fcntl.flock(self.handle, fcntl.LOCK_UN)

		def __del__(self):
			self.handle.close()

def camelToSpaces( label ):
	# Utility for pretty-printing the labels
	s1 = re.sub('(.)([A-Z][a-z]+)', r'\1 \2', label)
	return re.sub('([a-z0-9])([A-Z])', r'\1 \2', s1)

def factFinder( instance, namespace, label ):
	# Locate facts in the instance document by namespace and label, ignoring facts that have a context with a segment_element
	l = []
	for f in instance.items:
		if f.qname.namespace_name.find( namespace ) and f.qname.local_name == label:
			l.append( f )
	if not l:
		print('### NOT FOUND', label, type(l))
	return l
	
def printFacts( facts, indent=1, targetDate=None ):
    my_list = []
    for fact in facts:
        if targetDate==None or fact.context.period.instant == targetDate:
            if fact.concept.item_type==fact.concept.MONETARY_ITEM_TYPE:
                my_list.append(fact.effective_numeric_value)
                #print( indent * "\t", camelToSpaces( fact.qname.local_name ).ljust(100-indent*8), "$", '{}'.format( factValue ) )
            else:
                my_list.append(fact.normalized_value)
                #print( indent * "\t", camelToSpaces( fact.qname.local_name ).ljust(100-indent*8), factValue )
    return my_list
 
def on_xbrl_valid( job, instance ):
 
    try:
        # a portable solution to get the tmp dir
        import tempfile
        tmp = tempfile.gettempdir()
        tmplk = os.path.join( tmp, "extract_ratios_lock.tmp" )
        lock = Lock(tmplk)
        lock.acquire()
 
        # Create output CSV file if it doesn't exist yet
        if not os.path.isfile( "ratios.csv" ):
            with open("ratios.csv", "a") as ratiofile:
                	ratiofile.write( "TradingSymbol,EntityRegistrantName,EntityCentralIndexKey,CurrentFiscalYearEndDate,DocumentType,DocumentPeriodEndDate,DocumentFiscalYearFocus,DocumentFiscalPeriodFocus,EntityCommonStockSharesOutstanding,EntityCommonStockSharesOutstanding,DocumentCreationDate,CommonStockSharesIssued,CommonStockSharesIssued,TreasuryStockShares,CommonStockSharesOutstanding\n" )
                	ratiofile.close()
 
   		# Extract some basic facts from the filing, such as the effective end-date for balance sheet etc.
        tradingSymbol = factFinder( instance, sec_ns, "TradingSymbol" )
        entityRegistrantName = factFinder( instance, sec_ns, "EntityRegistrantName" )
        entityCentralIndexKey = factFinder( instance, sec_ns, "EntityCentralIndexKey" )
        currentFiscalYearEndDate = factFinder( instance, sec_ns, "CurrentFiscalYearEndDate" )
        documentType = factFinder( instance, sec_ns, "DocumentType" )
        documentPeriodEndDate = factFinder( instance, sec_ns, "DocumentPeriodEndDate" )
        documentFiscalYearFocus = factFinder( instance, sec_ns, "DocumentFiscalYearFocus" )
        documentFiscalPeriodFocus = factFinder( instance, sec_ns, "DocumentFiscalPeriodFocus" )
       	entityCommonStockSharesOutstanding = factFinder( instance, sec_ns, "EntityCommonStockSharesOutstanding" )
        documentCreationDate = factFinder( instance, sec_ns, "DocumentCreationDate" )
        commonStockSharesIssued = factFinder( instance, sec_ns, "CommonStockSharesIssued" )
        treasuryStockShares = factFinder( instance, sec_ns, "TreasuryStockShares" )
        commonStockSharesOutstanding = factFinder( instance, sec_ns, "CommonStockSharesOutstanding" )
        docEndDate = "2017-07-31"
        if len(documentPeriodEndDate) > 0:
            docEndDate = documentPeriodEndDate[0].normalized_value
 
        # Print information about filing and entity
        print( "Document and Entity Information:" )
        tradingSymbol = printFacts( tradingSymbol ) if tradingSymbol else ""
        entityRegistrantName = printFacts( entityRegistrantName ) if entityRegistrantName else ""
        entityCentralIndexKey = printFacts( entityCentralIndexKey ) if entityCentralIndexKey else ""
        currentFiscalYearEndDate = printFacts( currentFiscalYearEndDate ) if currentFiscalYearEndDate else ""
        docType = printFacts( documentType )
        documentPeriodEndDate = printFacts( documentPeriodEndDate ) if documentPeriodEndDate else ""
        documentFiscalYearFocus = printFacts( documentFiscalYearFocus ) if documentFiscalYearFocus else ""
        documentFiscalPeriodFocus = printFacts( documentFiscalPeriodFocus ) if documentFiscalPeriodFocus else ""
        entityCommonStockSharesOutstanding = printFacts( entityCommonStockSharesOutstanding, 3 ) if entityCommonStockSharesOutstanding else ""
        documentCreationDate = printFacts( documentCreationDate ) if documentCreationDate else ""
        commonStockSharesIssued = printFacts( commonStockSharesIssued, 3, docEndDate ) if commonStockSharesIssued else ""
       	treasuryStockShares = printFacts( treasuryStockShares, 3, docEndDate ) if treasuryStockShares else ""
        commonStockSharesOutstanding = printFacts( commonStockSharesOutstanding, 3, docEndDate ) if commonStockSharesOutstanding else ""
 
        # Append ratios to a CSV file for further analysis
        docType = ','.join(docType)
        tradingSymbol = ','.join(tradingSymbol)
        entityRegistrantName = ','.join(entityRegistrantName)
        entityCentralIndexKey = ','.join(entityCentralIndexKey)
        currentFiscalYearEndDate = ','.join(currentFiscalYearEndDate)
        documentPeriodEndDate = ','.join(documentPeriodEndDate) 
        documentFiscalYearFocus = ','.join(documentFiscalYearFocus) 
        documentFiscalPeriodFocus = ','.join(documentFiscalPeriodFocus) 
        entityCommonStockSharesOutstanding = ','.join(entityCommonStockSharesOutstanding)
        commonStockSharesIssued = ','.join(commonStockSharesIssued)
        treasuryStockShares = ','.join(treasuryStockShares)
        commonStockSharesOutstanding = ','.join(commonStockSharesOutstanding)
        
        with open("ratios.csv", "a") as ratiofile:
        	buff = '{},{},{},{},{},{},{},{},{},{},{},{},{}\n'.format(tradingSymbol + "," + entityRegistrantName + "," + entityCentralIndexKey + ","
                                                          + currentFiscalYearEndDate + ","
                                                          + docType + "," + documentPeriodEndDate + "," + documentFiscalYearFocus + ","
                                                          + documentFiscalPeriodFocus + ","
                                                          + entityCommonStockSharesOutstanding + "," + documentCreationDate + ","
                                                          + commonStockSharesIssued + "," + treasuryStockShares + ","
                                                          + commonStockSharesOutstanding)
        	ratiofile.write(buff)
 
    finally:
        lock.release()

Ivan1 · Aug-31-2017, 01:41 AM

I managed to find how to sum the values for 'entityCommonStockSharesOutstanding'. I needed to convert the string to integers and then sum it up. Below is the code that worked:

entityCommonStockSharesOutstanding = sum(list(map(int, entityCommonStockSharesOutstanding)))

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Converting a json file to a dataframe with rows and columns	eyavuz21	13	4,392	Jan-29-2023, 03:59 PM Last Post: eyavuz21
	How to properly format rows and columns in excel data from parsed .txt blocks	jh67	7	1,872	Dec-12-2022, 08:22 PM Last Post: jh67
	Check DataFrames with different sorting in columns and rows	Foxyskippy	0	771	Nov-19-2022, 07:49 AM Last Post: Foxyskippy
	Split single column to multiple columns	SriRajesh	1	1,319	Jan-07-2022, 06:43 PM Last Post: jefsummers
	The code I have written removes the desired number of rows, but wrong rows	Jdesi1983	0	1,625	Dec-08-2021, 04:42 AM Last Post: Jdesi1983
	making variables in my columns and rows in python	kronhamilton	2	1,609	Oct-31-2021, 10:38 AM Last Post: snippsat
	rows from sql query need to write to a file as columns	sjcsvatt	6	2,382	Oct-09-2021, 12:45 AM Last Post: snippsat
	Merging spreadsheets with the same columns and extracting rows with matching entries	johnbernard	3	9,412	Aug-19-2021, 03:08 PM Last Post: johnbernard
	Summing up rows and columns	plumberpy	3	2,260	Aug-18-2021, 05:46 AM Last Post: naughtyCat
	Pandas DataFrame combine rows by column value, where Date Rows are NULL	rhat398	0	2,105	May-04-2021, 10:51 PM Last Post: rhat398

[split] Rearranging CSV columns and rows

User Panel Messages

Announcements