How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: How to change from printFacts ( ) to return a list & Loop over list when writing CSV (/thread-4564.html) Pages:
1
2
|
How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Ivan1 - Aug-27-2017 I am new to Python and have modified below code to return various data points including below data point for each company. However, in cases where there are two values for the same data point, the code only writes one value to CSV. For example, the following will be returned on my Mac terminal: Entity Common Stock Shares Outstanding 70852076 Entity Common Stock Shares Outstanding 3900903065 Common Stock Shares Issued 71000000 Common Stock Shares Issued 3986000000 But the CSV file only contains: Entity Common Stock Shares Outstanding 3900903065 Common Stock Shares Issued 3986000000 CSV file does not contain: Entity Common Stock Shares Outstanding 70852076 Common Stock Shares Issued 71000000 I have been told that this is because printFacts() prints all the facts that match the given target date, but it only returns the last one, and that's the one that gets written to the CSV file. I have been advised to change it to return a list, and then loop over the list when writing to the CSV. The problem is that I do not know how to change the code to return a list and then to loop over the list as am new to Python and coding in general. I have been spending hours trying to do this but no luck. Appreciate if someone can show me what I need to change and where - thank you. # Copyright 2014 Altova GmbH # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. import os, re, sys from altova import xml, xsd, xbrl sec_ns = '/dei/' # was: 'http:' fasb_ns = '/us-gaap/' # was: 'http:' # We need to implement a simple locking class that allows us to avoid having # on_xbrl_valid called from multiple threads in RaptorXML at the same time. if sys.platform == 'win32': import msvcrt # fcntl is not available under Windows, so we'll use a # file locking function from the msvcrt library instead... class Lock: def __init__(self, filename): self.filename = filename # This will create it if it does not exist already self.handle = open(filename, 'w') # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock def acquire(self): msvcrt.locking(self.handle.fileno(), msvcrt.LK_LOCK, 1) def release(self): msvcrt.locking(self.handle.fileno(), msvcrt.LK_UNLCK, 1) def __del__(self): self.handle.close() else: import fcntl # Under Linux and MacOS we can use the fcntl library to implement # the simple file locking mechanism... class Lock: def __init__(self, filename): self.filename = filename # This will create it if it does not exist already self.handle = open(filename, 'w') # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock def acquire(self): fcntl.flock(self.handle, fcntl.LOCK_EX) def release(self): fcntl.flock(self.handle, fcntl.LOCK_UN) def __del__(self): self.handle.close() def camelToSpaces( label ): # Utility for pretty-printing the labels s1 = re.sub('(.)([A-Z][a-z]+)', r'\1 \2', label) return re.sub('([a-z0-9])([A-Z])', r'\1 \2', s1) def factFinder( instance, namespace, label ): # Locate facts in the instance document by namespace and label, ignoring facts that have a context with a segment_element l = [] for f in instance.items: if f.qname.namespace_name.find( namespace ) and f.qname.local_name == label: l.append( f ) if not l: print('### NOT FOUND', label, type(l)) return l def printFacts( facts, indent=1, targetDate=None ): # Find the fact for the relevant target date and print it factValue = 0 for fact in facts: if targetDate==None or fact.context.period.instant == targetDate: if fact.concept.item_type==fact.concept.MONETARY_ITEM_TYPE: factValue = fact.effective_numeric_value print( indent * "\t", camelToSpaces( fact.qname.local_name ).ljust(100-indent*8), "$", '{0:>16,}'.format( factValue ) ) else: factValue = fact.normalized_value print( indent * "\t", camelToSpaces( fact.qname.local_name ).ljust(100-indent*8), factValue ) return factValue def on_xbrl_valid( job, instance ): try: # a portable solution to get the tmp dir import tempfile tmp = tempfile.gettempdir() tmplk = os.path.join( tmp, "extract_ratios_lock.tmp" ) lock = Lock(tmplk) lock.acquire() # Create output CSV file if it doesn't exist yet if not os.path.isfile( "ratios.csv" ): with open("ratios.csv", "a") as ratiofile: ratiofile.write( "TradingSymbol,EntityRegistrantName,EntityCentralIndexKey,CurrentFiscalYearEndDate,DocumentType,DocumentPeriodEndDate,DocumentFiscalYearFocus,DocumentFiscalPeriodFocus,EntityCommonStockSharesOutstanding,DocumentCreationDate,CommonStockSharesIssued,TreasuryStockShares,CommonStockSharesOutstanding\n" ) ratiofile.close() # Extract some basic facts from the filing, such as the effective end-date for balance sheet etc. tradingSymbol = factFinder( instance, sec_ns, "TradingSymbol" ) entityRegistrantName = factFinder( instance, sec_ns, "EntityRegistrantName" ) entityCentralIndexKey = factFinder( instance, sec_ns, "EntityCentralIndexKey" ) currentFiscalYearEndDate = factFinder( instance, sec_ns, "CurrentFiscalYearEndDate" ) documentType = factFinder( instance, sec_ns, "DocumentType" ) documentPeriodEndDate = factFinder( instance, sec_ns, "DocumentPeriodEndDate" ) documentFiscalYearFocus = factFinder( instance, sec_ns, "DocumentFiscalYearFocus" ) documentFiscalPeriodFocus = factFinder( instance, sec_ns, "DocumentFiscalPeriodFocus" ) entityCommonStockSharesOutstanding = factFinder( instance, sec_ns, "EntityCommonStockSharesOutstanding" ) documentCreationDate = factFinder( instance, sec_ns, "DocumentCreationDate" ) commonStockSharesIssued = factFinder( instance, sec_ns, "CommonStockSharesIssued" ) treasuryStockShares = factFinder( instance, sec_ns, "TreasuryStockShares" ) commonStockSharesOutstanding = factFinder( instance, sec_ns, "CommonStockSharesOutstanding" ) docEndDate = "2017-07-31" if len(documentPeriodEndDate) > 0: docEndDate = documentPeriodEndDate[0].normalized_value # Print information about filing and entity print( "Document and Entity Information:" ) tradingSymbol = printFacts( tradingSymbol ) if tradingSymbol else "" entityRegistrantName = printFacts( entityRegistrantName ) if entityRegistrantName else "" entityCentralIndexKey = printFacts( entityCentralIndexKey ) if entityCentralIndexKey else "" currentFiscalYearEndDate = printFacts( currentFiscalYearEndDate ) if currentFiscalYearEndDate else "" docType = printFacts( documentType ) documentPeriodEndDate = printFacts( documentPeriodEndDate ) if documentPeriodEndDate else "" documentFiscalYearFocus = printFacts( documentFiscalYearFocus ) if documentFiscalYearFocus else "" documentFiscalPeriodFocus = printFacts( documentFiscalPeriodFocus ) if documentFiscalPeriodFocus else "" entityCommonStockSharesOutstanding = printFacts( entityCommonStockSharesOutstanding, 3 ) if entityCommonStockSharesOutstanding else "" documentCreationDate = printFacts( documentCreationDate ) if documentCreationDate else "" commonStockSharesIssued = printFacts( commonStockSharesIssued, 3, docEndDate ) if commonStockSharesIssued else "" treasuryStockShares = printFacts( treasuryStockShares, 3, docEndDate ) if treasuryStockShares else "" commonStockSharesOutstanding = printFacts( commonStockSharesOutstanding, 3, docEndDate ) if commonStockSharesOutstanding else "" # Append ratios to a CSV file for further analysis with open("ratios.csv", "a") as ratiofile: ratiofile.write( tradingSymbol + "," + entityRegistrantName + "," + entityCentralIndexKey + "," + currentFiscalYearEndDate + "," + docType + "," + documentPeriodEndDate + "," + documentFiscalYearFocus + "," + documentFiscalPeriodFocus + "," + entityCommonStockSharesOutstanding + "," + documentCreationDate + "," + commonStockSharesIssued + "," + treasuryStockShares + "," + commonStockSharesOutstanding + "\n") ratiofile.close() finally: lock.release() RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Larz60+ - Aug-27-2017 obviously, both statements can't be true if relating to the same security. Is the code you show the original, or modified code? You should show (and label) both so that we can see where your modifications fail. RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Ivan1 - Aug-27-2017 The code I posted is modified to extract, print and write to CSV additional data points that the orginal code did not cover (like for example: Entity Common Stock Shares Outstanding"). I have enclosed the original code below. # Copyright 2014 Altova GmbH # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http: # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. import os, re, sys from altova import xml, xsd, xbrl sec_ns = '/dei/' # was: 'http: fasb_ns = '/us-gaap/' # was: 'http: # We need to implement a simple locking class that allows us to avoid having # on_xbrl_valid called from multiple threads in RaptorXML at the same time. if sys.platform == 'win32': import msvcrt # fcntl is not available under Windows, so we'll use a # file locking function from the msvcrt library instead... class Lock: def __init__(self, filename): self.filename = filename # This will create it if it does not exist already self.handle = open(filename, 'w') # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock def acquire(self): msvcrt.locking(self.handle.fileno(), msvcrt.LK_LOCK, 1) def release(self): msvcrt.locking(self.handle.fileno(), msvcrt.LK_UNLCK, 1) def __del__(self): self.handle.close() else: import fcntl # Under Linux and MacOS we can use the fcntl library to implement # the simple file locking mechanism... class Lock: def __init__(self, filename): self.filename = filename # This will create it if it does not exist already self.handle = open(filename, 'w') # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock def acquire(self): fcntl.flock(self.handle, fcntl.LOCK_EX) def release(self): fcntl.flock(self.handle, fcntl.LOCK_UN) def __del__(self): self.handle.close() def camelToSpaces( label ): # Utility for pretty-printing the labels s1 = re.sub('(.)([A-Z][a-z]+)', r'\1 \2', label) return re.sub('([a-z0-9])([A-Z])', r'\1 \2', s1) def factFinder( instance, namespace, label ): # Locate facts in the instance document by namespace and label, ignoring facts that have a context with a segment_element l = [] for f in instance.items: if f.qname.namespace_name.find( namespace ) and f.qname.local_name == label: segment = None try: entElement = f.context.entity.element for childElement in entElement.children: if childElement.local_name=="segment": segment = childElement except: pass if segment==None: l.append( f ) return l def printFacts( facts, indent=1, targetDate=None ): # Find the fact for the relevant target date and print it factValue = 0 for fact in facts: if targetDate==None or fact.context.period.instant == targetDate: if fact.concept.item_type==fact.concept.MONETARY_ITEM_TYPE: factValue = fact.effective_numeric_value print( indent * "\t", camelToSpaces( fact.qname.local_name ).ljust(100-indent*8), "$", '{0:>16,}'.format( factValue ) ) else: factValue = fact.normalized_value print( indent * "\t", camelToSpaces( fact.qname.local_name ).ljust(100-indent*8), factValue ) return factValue def on_xbrl_valid( job, instance ): try: # a portable solution to get the tmp dir import tempfile tmp = tempfile.gettempdir() tmplk = os.path.join( tmp, "extract_ratios_lock.tmp" ) lock = Lock(tmplk) lock.acquire() # Create output CSV file if it doesn't exist yet if not os.path.isfile( "ratios.csv" ): with open("ratios.csv", "a") as ratiofile: ratiofile.write( "DocumentType,EntityName,CIK,PeriodEndDate,CurrentRatio,QuickRatio,CashRatio\n" ) ratiofile.close() # Extract some basic facts from the filing, such as the effective end-date for balance sheet etc. docEndDate = "2013-12-31" documentType = factFinder( instance, sec_ns, "DocumentType" ) documentFiscalYearFocus = factFinder( instance, sec_ns, "DocumentFiscalYearFocus" ) documentFiscalPeriodFocus = factFinder( instance, sec_ns, "DocumentFiscalPeriodFocus" ) documentPeriodEndDate = factFinder( instance, sec_ns, "DocumentPeriodEndDate" ) if len(documentPeriodEndDate) > 0: docEndDate = documentPeriodEndDate[0].normalized_value # Extract Filer Name and other key data entityRegistrantName = factFinder( instance, sec_ns, "EntityRegistrantName" ) entityCentralIndexKey = factFinder( instance, sec_ns, "EntityCentralIndexKey" ) entityCommonStockSharesOutstanding = factFinder( instance, sec_ns, "EntityCommonStockSharesOutstanding" ) # Print information about filing and entity print( "Document and Entity Information:" ) docType = printFacts( documentType ) entityName = printFacts( entityRegistrantName ) entityCIK = printFacts( entityCentralIndexKey ) printFacts( documentPeriodEndDate ) printFacts( documentFiscalPeriodFocus ) printFacts( documentFiscalYearFocus ) if docType=="10-K" or docType=="10-Q": # Now let's calculate some useful ratios from the balance sheet print( "Analytical Ratios:" ) print( "\tBalance Sheet:" ) # Current Ratio currentRatio = 0 print( "\t\tCurrent Ratio = Current Assets / Current Liabilities:" ) currentAssetsFacts = factFinder( instance, fasb_ns, "AssetsCurrent" ) currentLiabilitiesFacts = factFinder( instance, fasb_ns, "LiabilitiesCurrent" ) currentAssets = printFacts( currentAssetsFacts, 3, docEndDate ) currentLiabilities = printFacts( currentLiabilitiesFacts, 3, docEndDate ) if not currentLiabilities==0: currentRatio = currentAssets / currentLiabilities print( 3 * "\t", "Current Ratio = ".ljust(100-3*8), '{0:.2f}'.format( currentRatio ) ) # Quick Ratio quickRatio = 0 print( "\t\tQuick Ratio = ( Cash + Short-Term Marketable Securities + Accounts Receivable ) / Current Liabilities:" ) cashFacts = factFinder( instance, fasb_ns, "Cash" ) if len(cashFacts)==0: cashFacts = factFinder( instance, fasb_ns, "CashAndCashEquivalentsAtCarryingValue" ) if len(cashFacts)==0: cashFacts = factFinder( instance, fasb_ns, "CashCashEquivalentsAndShortTermInvestments" ) marketableSecuritiesFacts = factFinder( instance, fasb_ns, "MarketableSecuritiesCurrent" ) if len(marketableSecuritiesFacts)==0: marketableSecuritiesFacts = factFinder( instance, fasb_ns, "AvailableForSaleSecuritiesCurrent" ) if len(marketableSecuritiesFacts)==0: marketableSecuritiesFacts = factFinder( instance, fasb_ns, "ShortTermInvestments" ) if len(marketableSecuritiesFacts)==0: marketableSecuritiesFacts = factFinder( instance, fasb_ns, "OtherShortTermInvestments" ) accountsReceivableFacts = factFinder( instance, fasb_ns, "AccountsReceivableNetCurrent" ) currentLiabilitiesFacts = factFinder( instance, fasb_ns, "LiabilitiesCurrent" ) cash = printFacts( cashFacts, 3, docEndDate ) marketableSecurities = printFacts( marketableSecuritiesFacts, 3, docEndDate ) accountsReceivable = printFacts( accountsReceivableFacts, 3, docEndDate ) currentLiabilities = printFacts( currentLiabilitiesFacts, 3, docEndDate ) if not currentLiabilities==0: quickRatio = ( cash + marketableSecurities + accountsReceivable ) / currentLiabilities print( 3 * "\t", "Quick Ratio = ".ljust(100-3*8), '{0:.2f}'.format( quickRatio ) ) # Cash Ratio cashRatio = 0 print( "\t\tCash Ratio = ( Cash + Short-Term Marketable Securities ) / Current Liabilities:" ) cash = printFacts( cashFacts, 3, docEndDate ) marketableSecurities = printFacts( marketableSecuritiesFacts, 3, docEndDate ) currentLiabilities = printFacts( currentLiabilitiesFacts, 3, docEndDate ) if not currentLiabilities==0: cashRatio = ( cash + marketableSecurities ) / currentLiabilities print( 3 * "\t", "Cash Ratio = ".ljust(100-3*8), '{0:.2f}'.format( cashRatio ) ) # Append ratios to a CSV file for further analysis with open("ratios.csv", "a") as ratiofile: ratiofile.write( docType + ',"' + entityName + '",' + entityCIK + "," + docEndDate + "," + '{0:.2f}'.format( currentRatio ) + "," + '{0:.2f}'.format( quickRatio ) + "," + '{0:.2f}'.format( cashRatio ) + "\n" ) ratiofile.close() finally: lock.release() Note, I should add that for the above security (Ford) there are two values for Entity Common Stock Shares Outstanding as one is for Common Stock and other is for Class B stock but as per SEC's website the XBRL for Ford's 10 Q shows the above values on two different lines but both have same name: EntityCommomStockSharesOutstanding The modified code returns the values for both common stock and class B stock (both shown separately on my mac terminal screen under heading Entity Common Stock Shares Outstanding but the written CSV file only picks up one of the two values for Entity Common Stock Shares Oustanding RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Larz60+ - Aug-27-2017 This code is very hard to read. I'll come back to it sometime today when I have less to do. One thing i will mention on your modifications, it's OK to (in fact you should) break long statements like: ratiofile.write( docType + ',"' + entityName + '",' + entityCIK + "," + docEndDate + "," + '{0:.2f}'.format( currentRatio ) + "," + '{0:.2f}'.format( quickRatio ) + "," + '{0:.2f}'.format( cashRatio ) + "\n" )into multiple lines like: ratiofile.write( docType + ',"' + entityName + '",' + entityCIK + "," + docEndDate + "," + '{0:.2f}'.format( currentRatio ) + "," + '{0:.2f}'.format( quickRatio ) + "," + '{0:.2f}'.format( cashRatio ) + "\n" )And much easier to read. RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Ivan1 - Aug-28-2017 I have made the code more simple. But when you run it, I still get same problem. On my mac terminal I get the two different values for Entity Common Stock Shares Outstanding, but I only get one (of the two) values for Entity Common Stock Shares Outstanding in the CSV file. All I need is to get all the values that are returned on Mac terminal screen in the CSV file. Can someone help? Code below: import os, re, sys from altova import xml, xsd, xbrl sec_ns = '/dei/' # was: 'http: fasb_ns = '/us-gaap/' # was: 'http: # We need to implement a simple locking class that allows us to avoid having # on_xbrl_valid called from multiple threads in RaptorXML at the same time. if sys.platform == 'win32': import msvcrt # fcntl is not available under Windows, so we'll use a # file locking function from the msvcrt library instead... class Lock: def __init__(self, filename): self.filename = filename # This will create it if it does not exist already self.handle = open(filename, 'w') # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock def acquire(self): msvcrt.locking(self.handle.fileno(), msvcrt.LK_LOCK, 1) def release(self): msvcrt.locking(self.handle.fileno(), msvcrt.LK_UNLCK, 1) def __del__(self): self.handle.close() else: import fcntl # Under Linux and MacOS we can use the fcntl library to implement # the simple file locking mechanism... class Lock: def __init__(self, filename): self.filename = filename # This will create it if it does not exist already self.handle = open(filename, 'w') # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock def acquire(self): fcntl.flock(self.handle, fcntl.LOCK_EX) def release(self): fcntl.flock(self.handle, fcntl.LOCK_UN) def __del__(self): self.handle.close() l = tradingSymbol,entityRegistrantName,entityCentralIndexKey,currentFiscalYearEndDate,documentType,documentPeriodEndDate,documentFiscalYearFocus,documentFiscalPeriodFocus,entityCommonStockSharesOutstanding,documentCreationDate,commonStockSharesIssued,treasuryStockShares,commonStockSharesOutstanding with open("output.csv", "w") as csv_file: wr = csv.writer(csv_file) wr.write(csv_file) RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Larz60+ - Aug-28-2017 I finally got to this, but cannot run as I don't have the package altova. RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Ivan1 - Aug-28-2017 Thank you so much for you help. I spent 5 hours yesterday trying to make it work but no luck. I have Altova set up and can run the code and share the results with you. They offer their product for free for 30 day trial period; it is called Raptor XBRL + Server RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - Ivan1 - Aug-29-2017 This is the feedback that I received from Altova's support team: "The instance may contain multiple values for a fact, the printFacts function returns the last one in such a case. If you want to write all values into one CSV field or a line per value into CSV you may modify printFacts or copy it to another function that returns all values and use that" But I don't know what exactly I need to modify in the printFacts function? RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - wavic - Aug-29-2017 It's not necessary to put comments on every piece of code. In Python, you can put a multistring comment right under the function definition so you could use it lately to build a documentation. def func(): """This function doesn't do anything. It's just an example" pass RE: How to change from printFacts ( ) to return a list & Loop over list when writing CSV - metulburr - Aug-29-2017 (Aug-29-2017, 08:55 PM)Ivan1 Wrote: This is the feedback that I received from Altova's support team: I dont have this module as well....so take what i say with a grain of salt as i am rushing typing this as i head out the door..... factfinder() returns a list and sends it to printFacts() as "facts". Then this functions loops through them under conditions to get "factValue" Quote:factValue = fact.effective_numeric_valueSo based on this i would assume if you make a list and append fact.effective_numeric_value or fact.normalized_value (based on whatever you want) then you would get all the values instead of the last one. Then return the list instead of factValue. But most likely that is going ot break the code as i would assume somewhere it expects that to be an int, but will now get a list. |