Posts: 4,802
Threads: 77
Joined: Jan 2018
Dec-23-2019, 05:13 PM
(This post was last modified: Dec-23-2019, 05:13 PM by Gribouillis.)
The __repr__ gives a nice representation of the object when you try to display it, for example
```python
>>> class Statement:
...     def __init__(self, data):
...         self.data = data
...     def __repr__(self):
...         return "{}({})".format(self.__class__.__name__, self.data)
...
>>> class Storey(Statement):
...     pass
...
>>> element = Storey(('STOREY', 'foo', 'BAR', 'baz'))
>>>
>>> element
Storey(('STOREY', 'foo', 'BAR', 'baz'))
```
If I don't include the __repr__ in the class definition, the element is displayed like this
```python
>>> class Statement:
...     def __init__(self, data):
...         self.data = data
...
>>> class Storey(Statement):
...     pass
...
>>> element = Storey(('STOREY', 'foo', 'BAR', 'baz'))
>>> element
<__main__.Storey object at 0x7ff163ad3898>
```
Using classes for the various statements will help you in the second step, when you transform these statements into the second language, or in intermediary steps if you want to manipulate the data easily. To start with, you can simply define a subclass of Statement for each kind of statement that you meet in the input file. For example, if there are statements for doors, then use a class
```python
class Door(Statement):
    pass
```
These classes are empty for now but you will be free to add features to them later.
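One way such subclasses could be put to work, sketched under assumptions: a lookup table picks the class from the first field of a statement. The names `STATEMENT_CLASSES` and `make_statement` and the `'DOOR'` keyword are hypothetical and only serve the illustration.

```python
class Statement:
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.data)


class Storey(Statement):
    pass


class Door(Statement):
    pass


# Hypothetical lookup table: statement keyword -> class to instantiate.
STATEMENT_CLASSES = {'STOREY': Storey, 'DOOR': Door}


def make_statement(data):
    """Wrap a tuple of fields in the class registered for its first field."""
    # Unknown keywords fall back to the plain base class.
    cls = STATEMENT_CLASSES.get(data[0], Statement)
    return cls(data)


print(make_statement(('STOREY', 'foo', 'BAR', 'baz')))
print(make_statement(('WINDOW', 'w1')))
```

Statements with an unregistered keyword still get parsed, they just end up as plain `Statement` objects until a dedicated class is needed.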
It would be a good idea to include an example of a typical input file if you can do that.
Posts: 13
Threads: 1
Joined: Dec 2019
Dec-24-2019, 11:55 AM
(This post was last modified: Dec-24-2019, 11:55 AM by kingsman.)
(Dec-23-2019, 05:13 PM)Gribouillis Wrote: These classes are empty for now but you will be free to add features to them later.
It would be a good idea to include an example of a typical input file if you can do that.
There is an obstacle again. I would like to deal with the materials first.
Here are the material statements as they appear in the software's file.
TABLE: "MATERIAL PROPERTIES 01 - GENERAL"
Material=4000Psi Type=Concrete SymType=Isotropic TempDepend=No Color=Magenta Notes="Customary f'c 4000 psi 23/12/2019 2:17:43 pm"
Material=A615Gr60 Type=Rebar SymType=Uniaxial TempDepend=No Color=White Notes="ASTM A615 Grade 60 23/12/2019 2:18:28 pm"
Material=A992Fy50 Type=Steel SymType=Isotropic TempDepend=No Color=Red Notes="ASTM A992 Grade 50 23/12/2019 2:17:43 pm"
Material=C30 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:18:37 pm"
Material=C45 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:20:37 pm"
Material=C60 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:21:13 pm"
TABLE: "MATERIAL PROPERTIES 02 - BASIC MECHANICAL PROPERTIES"
Material=4000Psi UnitWeight=2.40276966513304E-06 UnitMass=2.45014307299925E-10 E1=2534.56354148831 G12=1056.0681422868 U12=0.2 A1=0.0000099
Material=A615Gr60 UnitWeight=7.84904757236607E-06 UnitMass=8.0038007068661E-10 E1=20389.0191580383 A1=0.0000117
Material=A992Fy50 UnitWeight=7.84904757236607E-06 UnitMass=8.0038007068661E-10 E1=20389.0191580383 G12=7841.93044539935 U12=0.3 A1=0.0000117
Material=C30 UnitWeight=2.49830467094493E-06 UnitMass=2.54756172606745E-10 E1=2263.76994673377 G12=943.237477805739 U12=0.2 A1=0.0000099
Material=C45 UnitWeight=2.49830467094493E-06 UnitMass=2.54756172606745E-10 E1=2692.05074746719 G12=1121.68781144466 U12=0.2 A1=0.0000099
Material=C60 UnitWeight=2.49830467094493E-06 UnitMass=2.54756172606745E-10 E1=3059.14857666726 G12=1274.64524027803 U12=0.2 A1=0.0000099
The first three materials are defaults built into the software, so I do not want to extract their information. Also, each material is split across two tables, so I would like to deal with 'material properties 01' first.
```python
SAP = open('Frame SAP.$2k', 'r')

class SAP_Material_Statement01:
    def __init__(self, Name, Type, Symtype, Tempdepend, Colour, Notes):
        self.Name = Name
        self.Type = Type
        self.Symtype = Symtype
        self.Tempdepend = Tempdepend
        self.Colour = Colour
        self.Notes = Notes

    def __repr__(self):
        return '{}( Material={} Type={} SymType={} TempDepend={} Color={} Notes="{}")'.format(
            self.__class__.__name__, self.Name, self.Type, self.Symtype,
            self.Tempdepend, self.Colour, self.Notes)

test = SAP_Material_Statement01('C30', 'Concrete', 'Isotropic', 'No', 'Blue', 'Concrete added 23/12/2019 2:18:37 pm')

class material_(SAP_Material_Statement01):
    pass

SAP_material_statement_01 = []
for line in SAP:
    split_01 = line.split('=')
    split_02 = line.split(' ')
    if split_01[0] == ' Material' and split_02[4] == 'TempDepend=No':
        SAP_material_statement_01.append(line)
print(SAP_material_statement_01)
for line in SAP_material_statement_01:
    print(line.split())
```
![[Image: 4sGzbKT]](https://ibb.co/4sGzbKT)
I am stuck at this step. I need everything after each '=' (e.g. C30, Concrete, Isotropic).
However, I do not know what the next step is.
Posts: 4,802
Threads: 77
Joined: Jan 2018
Don't give the classes too much structure at first. The priority is to parse the file. You will only add code to the classes when you actually want to do something with the data. Here the simple key=value form of the input makes it easy to parse with regular expressions. See this example
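```python
import io
import re

# In-memory stand-in for the input file: the rows of the first material table.
SAP = io.StringIO("""\
Material=4000Psi Type=Concrete SymType=Isotropic TempDepend=No Color=Magenta Notes="Customary f'c 4000 psi 23/12/2019 2:17:43 pm"
Material=A615Gr60 Type=Rebar SymType=Uniaxial TempDepend=No Color=White Notes="ASTM A615 Grade 60 23/12/2019 2:18:28 pm"
Material=A992Fy50 Type=Steel SymType=Isotropic TempDepend=No Color=Red Notes="ASTM A992 Grade 50 23/12/2019 2:17:43 pm"
Material=C30 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:18:37 pm"
Material=C45 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:20:37 pm"
Material=C60 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:21:13 pm"
""")

class Statement:
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.data)

class SAP_Material_01(Statement):
    pass

parsed_file = []
for line in SAP:
    # Split on "key=", keeping the keys thanks to the capturing group.
    L = re.split(r'(\w+)[=]', line.strip())
    assert L[0] == ''
    pairs = {}
    for i in range(1, len(L), 2):
        pairs[L[i]] = L[i + 1].strip()
    item = SAP_Material_01(pairs)
    parsed_file.append(item)

for item in parsed_file:
    print(item)
```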
Output: SAP_Material_01({'TempDepend': 'No', 'Color': 'Magenta', 'SymType': 'Isotropic', 'Material': '4000Psi', 'Type': 'Concrete', 'Notes': '"Customary f\'c 4000 psi 23/12/2019 2:17:43 pm"'})
SAP_Material_01({'TempDepend': 'No', 'Color': 'White', 'SymType': 'Uniaxial', 'Material': 'A615Gr60', 'Type': 'Rebar', 'Notes': '"ASTM A615 Grade 60 23/12/2019 2:18:28 pm"'})
SAP_Material_01({'TempDepend': 'No', 'Color': 'Red', 'SymType': 'Isotropic', 'Material': 'A992Fy50', 'Type': 'Steel', 'Notes': '"ASTM A992 Grade 50 23/12/2019 2:17:43 pm"'})
SAP_Material_01({'TempDepend': 'No', 'Color': 'Blue', 'SymType': 'Isotropic', 'Material': 'C30', 'Type': 'Concrete', 'Notes': '"Concrete added 23/12/2019 2:18:37 pm"'})
SAP_Material_01({'TempDepend': 'No', 'Color': 'Blue', 'SymType': 'Isotropic', 'Material': 'C45', 'Type': 'Concrete', 'Notes': '"Concrete added 23/12/2019 2:20:37 pm"'})
SAP_Material_01({'TempDepend': 'No', 'Color': 'Blue', 'SymType': 'Isotropic', 'Material': 'C60', 'Type': 'Concrete', 'Notes': '"Concrete added 23/12/2019 2:21:13 pm"'})
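To see why the loop steps through `L` two items at a time, it may help to look at what `re.split` returns for a single row. Because the pattern contains a capturing group, the keys are kept in the result and alternate with the values. A small sketch using one row from the table posted above:

```python
import re

line = ('Material=C30 Type=Concrete SymType=Isotropic TempDepend=No '
        'Color=Blue Notes="Concrete added 23/12/2019 2:18:37 pm"')

# The capturing group (\w+) keeps each key in the result list.
parts = re.split(r'(\w+)[=]', line.strip())

# parts[0] is the (empty) text before the first key. After that, keys sit
# at odd indexes and each value at the index that follows its key.
print(parts[:5])

pairs = {parts[i]: parts[i + 1].strip() for i in range(1, len(parts), 2)}
print(pairs['Material'], pairs['Type'], pairs['Notes'])
```

Note that this relies on no value containing something that looks like `word=`; a Notes field with an equals sign in it would need a more careful pattern.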
Posts: 13
Threads: 1
Joined: Dec 2019
Dec-26-2019, 12:48 PM
(This post was last modified: Dec-26-2019, 12:48 PM by kingsman.)
(Dec-24-2019, 01:03 PM)Gribouillis Wrote:
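```python
import io
import re

# In-memory stand-in for the input file: the rows of the first material table.
SAP = io.StringIO("""\
Material=4000Psi Type=Concrete SymType=Isotropic TempDepend=No Color=Magenta Notes="Customary f'c 4000 psi 23/12/2019 2:17:43 pm"
Material=A615Gr60 Type=Rebar SymType=Uniaxial TempDepend=No Color=White Notes="ASTM A615 Grade 60 23/12/2019 2:18:28 pm"
Material=A992Fy50 Type=Steel SymType=Isotropic TempDepend=No Color=Red Notes="ASTM A992 Grade 50 23/12/2019 2:17:43 pm"
Material=C30 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:18:37 pm"
Material=C45 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:20:37 pm"
Material=C60 Type=Concrete SymType=Isotropic TempDepend=No Color=Blue Notes="Concrete added 23/12/2019 2:21:13 pm"
""")

class Statement:
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.data)

class SAP_Material_01(Statement):
    pass

parsed_file = []
for line in SAP:
    L = re.split(r'(\w+)[=]', line.strip())
    assert L[0] == ''
    pairs = {}
    for i in range(1, len(L), 2):
        pairs[L[i]] = L[i + 1].strip()
    item = SAP_Material_01(pairs)
    parsed_file.append(item)

for item in parsed_file:
    print(item)
```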
I get what the code is doing here. However, there is a lot of other information in the file.
L[0] is not always ''. On the table title lines, L[0] is the whole title, so I can't get the data out.
Maybe I can show you the whole file so that we can discuss it more clearly.
https://textuploader.com/1oyau
Posts: 4,802
Threads: 77
Joined: Jan 2018
Dec-26-2019, 05:50 PM
(This post was last modified: Dec-26-2019, 08:26 PM by Gribouillis.)
As @buran said before, the input language is very structured. It is a list of tables whose rows consist of key/value pairs. Apparently, a table row that ends with an underscore _ is continued on the next line.
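As a rough illustration of that continuation rule (a sketch with made-up rows, not lines from your file), physical lines can be glued together until one does not end with an underscore. The helper name `logical_rows` is hypothetical.

```python
def logical_rows(lines):
    """Yield complete rows, joining lines that end with a '_' marker."""
    pending = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith('_'):
            # Drop the marker and wait for the rest of the row.
            pending.append(line[:-1].rstrip())
        else:
            pending.append(line)
            yield ' '.join(pending)
            pending = []
    if pending:
        # Input ended in the middle of a continued row; emit what we have.
        yield ' '.join(pending)


# Made-up sample: the first row is spread over two physical lines.
sample = [
    'Key1=a Key2=b _',
    'Key3=c',
    'Key1=d Key2=e',
]
for row in logical_rows(sample):
    print(row)
```

This assumes that a trailing underscore is always a continuation marker and never the last character of a real value.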
The code below transforms the input file into a Python script containing simple structured data: a list of items, each of which represents a table. Each item is a pair made of a table name and a list of rows. Each row is a list of pairs of Python strings representing a key and a value.
Here is the code that does this transformation. It is short but it is not yet well documented. I suggest that you try it with the input files that you have to see if it works. Its name is base_parsing.py . I execute it in a terminal with the command
Output: python3 base_parsing.py PATH_TO_THE_INPUT_FILE
Currently it prints its output to the console, but you can redirect it to a file foo.py (I don't know how you do that in Windows).
```python
from collections import namedtuple
from pprint import pformat
import re
import sys

TableRecord = namedtuple('TableRecord', ('table_name', 'rows'))

def table_data_lines(infile):
    # Yield (lineno, line) for every non-empty line before 'END TABLE DATA'.
    for lineno, line in enumerate(infile, 1):
        if line.startswith('END TABLE DATA'):
            break
        line = line.strip()
        if line:
            yield lineno, line

def read_table(table_name, sequence, parsed_file):
    name = None
    rows = []
    last_row = []
    for lineno, line in sequence:
        if line.startswith('TABLE:'):
            name = line[6:].strip().strip('"')
            break
        continued = line.endswith('_')
        if continued:
            line = line[:-1].rstrip()
        L = re.split(r'(\w+)[=]', line)
        assert L[0] == ''
        last_row.extend((L[i], L[i + 1].strip()) for i in range(1, len(L), 2))
        if not continued:
            rows.append(last_row)
            last_row = []
    parsed_file.append(TableRecord(table_name, rows))
    return name

# parse_file() as posted in the follow-up below (it skips any lines
# before the first TABLE: line).
def parse_file(infile):
    parsed_file = []
    sequence = table_data_lines(infile)
    for lineno, line in sequence:
        if line.startswith('TABLE:'):
            break
    else:
        return parsed_file
    table_name = line[6:].strip().strip('"')
    while table_name:
        table_name = read_table(table_name, sequence, parsed_file)
    return parsed_file

def create_python_script(parsed_file, outfile=sys.stdout):
    import builtins
    from functools import partial
    # builtins.print also works when this file is imported as a module,
    # where __builtins__ is a dict rather than a module.
    print = partial(builtins.print, file=outfile)
    print("from collections import namedtuple")
    print("\nTableRecord = namedtuple('TableRecord', ('table_name', 'rows'))")
    print('\nparsed_file = [')
    for tr in parsed_file:
        print(' TableRecord(table_name={}, rows=['.format(
            repr(tr.table_name)))
        for row in tr.rows:
            print('{},'.format(pformat(row)))
        print(']), # end of TableRecord')
    print('] # end of parsed_file')

def main():
    if len(sys.argv) != 2:
        print('Error: Usage: program filename.$2k')
        sys.exit(-1)
    with open(sys.argv[1]) as infile:
        parsed_file = parse_file(infile)
    create_python_script(parsed_file)

if __name__ == '__main__':
    main()
```
The output is a Python module that contains the data and that can be directly imported. That way, you can transform your data file into intermediary Python files, which can be a huge step towards the translation.
The result looks like this
```python
from collections import namedtuple

TableRecord = namedtuple('TableRecord', ('table_name', 'rows'))

parsed_file = [
    TableRecord(table_name='ACTIVE DEGREES OF FREEDOM', rows=[
        [('UX', 'Yes'),
         ('UY', 'Yes'),
         ('UZ', 'Yes'),
         ('RX', 'Yes'),
         ('RY', 'Yes'),
         ('RZ', 'Yes')],
    ]),
    TableRecord(table_name='ANALYSIS OPTIONS', rows=[
        [('Solver', 'Advanced'),
         ('SolverProc', 'Auto'),
         ('Force32Bit', 'No'),
         ('StiffCase', 'None'),
         ('GeomMod', 'No')],
    ]),
    TableRecord(table_name='AUTO WAVE 3 - WAVE CHARACTERISTICS - GENERAL', rows=[
        [('WaveChar', 'Default'),
         ('WaveType', '"From Theory"'),
         ('KinFactor', '1'),
         ('SWaterDepth', '45000'),
         ('WaveHeight', '18000'),
         ('WavePeriod', '12'),
         ('WaveTheory', 'Linear')],
    ]),
...
...
...
        [('RebarID', 'N24'),
         ('Area', '452.00001881498'),
         ('Diameter', '24.0000003604438')],
        [('RebarID', 'N28'),
         ('Area', '616.000025641654'),
         ('Diameter', '28.0000004205178')],
        [('RebarID', 'N32'),
         ('Area', '804.000033467353'),
         ('Diameter', '32.0000004805918')],
        [('RebarID', 'N36'),
         ('Area', '1020.00004245858'),
         ('Diameter', '36.0000005406658')],
    ]),
]
```
If you save the result in a file foo.py, you can then directly import the data like
```python
from foo import TableRecord, parsed_file
```
and you can then start working on producing the translated file.
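As a possible first step with the imported data, here is a sketch that merges the two material tables and leaves out the default materials. To keep it self-contained, `parsed_file` is a hand-written stand-in shortened from the material rows posted earlier in the thread, and `DEFAULT_MATERIALS`, `tables` and `materials` are names made up for this example.

```python
from collections import namedtuple

TableRecord = namedtuple('TableRecord', ('table_name', 'rows'))

# Hand-written stand-in for `from foo import parsed_file`, shortened
# from the material rows posted earlier in the thread.
parsed_file = [
    TableRecord(table_name='MATERIAL PROPERTIES 01 - GENERAL', rows=[
        [('Material', '4000Psi'), ('Type', 'Concrete'), ('Color', 'Magenta')],
        [('Material', 'C30'), ('Type', 'Concrete'), ('Color', 'Blue')],
    ]),
    TableRecord(table_name='MATERIAL PROPERTIES 02 - BASIC MECHANICAL PROPERTIES', rows=[
        [('Material', '4000Psi'), ('E1', '2534.56354148831'), ('U12', '0.2')],
        [('Material', 'C30'), ('E1', '2263.76994673377'), ('U12', '0.2')],
    ]),
]

# Materials that ship with the software and should be skipped.
DEFAULT_MATERIALS = {'4000Psi', 'A615Gr60', 'A992Fy50'}

# Index the tables by name, turning every row (a list of pairs) into a dict.
tables = {tr.table_name: [dict(row) for row in tr.rows] for tr in parsed_file}

# Merge the two material tables on their shared 'Material' key.
materials = {}
for name in ('MATERIAL PROPERTIES 01 - GENERAL',
             'MATERIAL PROPERTIES 02 - BASIC MECHANICAL PROPERTIES'):
    for row in tables[name]:
        if row['Material'] not in DEFAULT_MATERIALS:
            materials.setdefault(row['Material'], {}).update(row)

print(materials)
```

All values stay strings at this stage; converting numbers with `float()` can wait until the translation step knows which fields are numeric.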
Posts: 13
Threads: 1
Joined: Dec 2019
(Dec-26-2019, 05:50 PM)Gribouillis Wrote:
```python
from collections import namedtuple
from pprint import pformat
import re
import sys

TableRecord = namedtuple('TableRecord', ('table_name', 'rows'))

def table_data_lines(infile):
    for lineno, line in enumerate(infile, 1):
        if line.startswith('END TABLE DATA'):
            break
        line = line.strip()
        if line:
            yield lineno, line

def read_table(table_name, sequence, parsed_file):
    name = None
    rows = []
    last_row = []
    for lineno, line in sequence:
        if line.startswith('TABLE:'):
            name = line[6:].strip().strip('"')
            break
        continued = line.endswith('_')
        if continued:
            line = line[:-1].rstrip()
        L = re.split(r'(\w+)[=]', line)
        assert L[0] == ''
        last_row.extend((L[i], L[i + 1].strip()) for i in range(1, len(L), 2))
        if not continued:
            rows.append(last_row)
            last_row = []
    parsed_file.append(TableRecord(table_name, rows))
    return name
```
I get what table_data_lines is doing, but I don't understand read_table.
What are the parameters (table_name, sequence, parsed_file)?
Posts: 4,802
Threads: 77
Joined: Jan 2018
Dec-27-2019, 01:47 PM
(This post was last modified: Dec-27-2019, 01:47 PM by Gribouillis.)
kingsman Wrote: I get what the table_data_lines is doing but I don't know the things in read_table.
what is the parameter inside (table_name, sequence, parsed_file)?
Well, 'table_name' is the name of the table to read. It is known because this function is called immediately after the parser has read a line starting with TABLE:. The 'sequence' argument is the sequence produced by table_data_lines(), that is to say a sequence of pairs (lineno, line) read from the file. When the sequence is passed to read_table(), the lines that come next in the sequence are the rows of the table. The argument 'parsed_file' is the list to which we add the TableRecords that we produce. This list is created in the function parse_file().
The function read_table() reads lines from the file and creates a TableRecord whose rows are extracted from these lines. It stops when it meets a line starting with TABLE:, which indicates the beginning of a new table. In that case, it returns the name of the next table.
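The part that tends to confuse here is that `sequence` is a generator, so it remembers its position: whatever one call consumes is gone for the next call. A toy sketch of that behaviour, where `numbered` and `take_until_table` are hypothetical helpers, much simpler than `table_data_lines()` and `read_table()`:

```python
def numbered(lines):
    """Yield (lineno, line) pairs, in the same shape as table_data_lines()."""
    for lineno, line in enumerate(lines, 1):
        yield lineno, line


def take_until_table(sequence):
    """Collect lines until the next 'TABLE:' line, which is consumed too."""
    taken = []
    for lineno, line in sequence:
        if line.startswith('TABLE:'):
            return taken, line
        taken.append(line)
    return taken, None  # input exhausted, no further table


sequence = numbered(['TABLE: "A"', 'x=1', 'x=2', 'TABLE: "B"', 'y=1'])
next(sequence)                     # consume the first TABLE: line
print(take_until_table(sequence))  # rows of table A plus the header of B
print(take_until_table(sequence))  # continues where the first call stopped
```

This is why `read_table()` has to return the name of the next table: the `TABLE:` line that ended the current table has already been consumed and cannot be read again.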
I ignored the line File C:... at the beginning of the file that you linked. Is this line a part of the file? In that case, we may need to modify parse_file() in order to ignore all the lines that come before the first TABLE:... line.
Posts: 13
Threads: 1
Joined: Dec 2019
(Dec-27-2019, 01:47 PM)Gribouillis Wrote: The function read_table() reads lines from the file and creates a TableRecord whose rows are extracted from these lines. It stops when it meets a line starting with TABLE:, which indicates the beginning of a new table. In that case, it returns the name of the next table.
I ignored the line File C:... at the beginning of the file that you linked. Is this line a part of the file? In that case, we may need to modify parse_file() in order to ignore all the lines that come before the first TABLE:... line.
Yes, File C:... at the beginning of the file is part of the file.
I am a Python beginner, still absorbing the language.
This is my final year project. Why is it so difficult?
Posts: 4,802
Threads: 77
Joined: Jan 2018
Dec-27-2019, 02:15 PM
(This post was last modified: Dec-27-2019, 02:15 PM by Gribouillis.)
Here is the modified parse_file() that skips the lines before the first TABLE:
```python
def parse_file(infile):
    parsed_file = []
    sequence = table_data_lines(infile)
    for lineno, line in sequence:
        if line.startswith('TABLE:'):
            break
    else:
        return parsed_file
    table_name = line[6:].strip().strip('"')
    while table_name:
        table_name = read_table(table_name, sequence, parsed_file)
    return parsed_file
```
It is important that you try to parse several input files in order to discover potential issues that we haven't seen yet in the parsing phase.
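In case the `for ... else` in `parse_file()` is new to you: the `else` branch runs only when the loop finishes without hitting `break`, which here means that no `TABLE:` line was found at all. A minimal sketch with a hypothetical helper `first_table_name`:

```python
def first_table_name(lines):
    """Return the name on the first 'TABLE:' line, or None if there is none."""
    for line in lines:
        if line.startswith('TABLE:'):
            break        # found it; the else branch below is skipped
    else:
        return None      # the loop ended without break: no table at all
    # After a break, `line` still holds the TABLE: line.
    return line[6:].strip().strip('"')


print(first_table_name(['File C:...', 'TABLE: "MATERIAL PROPERTIES 01 - GENERAL"']))
print(first_table_name(['no tables here']))
```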
Posts: 13
Threads: 1
Joined: Dec 2019
I have done something similar to yours and extracted the information as you described before. Would this also be okay?
```python
import re

SAP = open('Frame SAP.$2k', 'r')

class Statement:
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.data)

class SAP_Material_01(Statement):
    pass

parsed_file = []
material_data_01 = []
for line in SAP:
    e = re.compile(r'\s{3}Material=\w+\s{3}Type=\w+\s{3}')
    d = re.match(e, line)
    if d != None:
        parsed_file.append(line)

for line in parsed_file:
    L = re.split(r'(\w+)[=]', line.strip())
    assert L[0] == ''
    pairs = {}
    for i in range(1, len(L), 2):
        pairs[L[i]] = L[i + 1].strip()
    item = SAP_Material_01(pairs)
    material_data_01.append(item)

for item in material_data_01:
    print(item)
```
Output: SAP_Material_01({'Material': '4000Psi', 'Type': 'Concrete', 'SymType': 'Isotropic', 'TempDepend': 'No', 'Color': 'Magenta', 'Notes': '"Customary f\'c 4000 psi 23/12/2019 2:17:43 pm"'})
SAP_Material_01({'Material': 'A615Gr60', 'Type': 'Rebar', 'SymType': 'Uniaxial', 'TempDepend': 'No', 'Color': 'White', 'Notes': '"ASTM A615 Grade 60 23/12/2019 2:18:28 pm"'})
SAP_Material_01({'Material': 'A992Fy50', 'Type': 'Steel', 'SymType': 'Isotropic', 'TempDepend': 'No', 'Color': 'Red', 'Notes': '"ASTM A992 Grade 50 23/12/2019 2:17:43 pm"'})
SAP_Material_01({'Material': 'C30', 'Type': 'Concrete', 'SymType': 'Isotropic', 'TempDepend': 'No', 'Color': 'Blue', 'Notes': '"Concrete added 23/12/2019 2:18:37 pm"'})
SAP_Material_01({'Material': 'C45', 'Type': 'Concrete', 'SymType': 'Isotropic', 'TempDepend': 'No', 'Color': 'Blue', 'Notes': '"Concrete added 23/12/2019 2:20:37 pm"'})
SAP_Material_01({'Material': 'C60', 'Type': 'Concrete', 'SymType': 'Isotropic', 'TempDepend': 'No', 'Color': 'Blue', 'Notes': '"Concrete added 23/12/2019 2:21:13 pm"'})