Apr-13-2023, 03:43 PM
Split does not take out spaces. Split divides a string into multiple substrings using a delimiter.
string = '123-456-7890'
substrings = string.split('-')
print(substrings)
Output: ['123', '456', '7890']
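To make the "split does not take out spaces" point concrete, here is a small sketch comparing split with strip and replace on plain Python strings. The sample values are made up for illustration and are not from your data.

```python
# str.split only cuts the string at the delimiter; it never strips anything.
value = " 1.23 "  # illustrative value with a leading and a trailing space

parts = value.split(" ")  # cut at every single space
print(parts)  # ['', '1.23', '']

stripped = value.strip()  # strip removes leading and trailing whitespace
print(repr(stripped))  # '1.23'

no_spaces = "1 23".replace(" ", "")  # str.replace removes every space
print(no_spaces)  # 123
```

Splitting on a space leaves empty strings where the spaces were, so picking one piece out of the result is a fragile way to clean up a value.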
So we now know that this makes no sense and you should never do it:
results_df['MIC/RESULT'].str.split(' ').str[1]
How about this? Does it ever make sense to do this?
results_df['MIC/RESULT'].replace(' ','')
I made a dataframe that has strings containing digits and spaces and tried your code.
import pandas as pd

df = pd.DataFrame({"A": [" 1.23 ", "1 23"]})
df["B"] = df['A'].replace(' ','')
print(df)
Output:
        A       B
0    1.23    1.23
1    1 23    1 23
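For comparison, here is a rough sketch of how Series.replace differs from the string accessor methods. The third row (a lone space) and the column names are my additions for illustration; I am assuming a reasonably recent pandas.

```python
import pandas as pd

# The third row is a lone space, added only to show when replace() does match.
df = pd.DataFrame({"A": [" 1.23 ", "1 23", " "]})

# Series.replace compares whole cell values by default,
# so only the row that is exactly " " changes.
df["whole_value"] = df["A"].replace(" ", "")

# Series.str.replace works on substrings, so every space is removed.
df["substring"] = df["A"].str.replace(" ", "", regex=False)

# Series.str.strip removes only leading and trailing whitespace.
df["stripped"] = df["A"].str.strip()

print(df)
```

Which one is appropriate depends on what you actually want to do with the spaces, which is why I keep asking.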
As expected, the replace code does nothing because none of the rows in "A" is exactly a single space. Is that what you wanted to happen? Find rows in MIC/RESULT that are a single space and replace them with an empty string?
And what do you mean by this?
Quote:
ID MIC1 MIC_FLOAT MIC_FOR_RANGES
1 = 0.0625 0.0625 0.0625
2 =0.5 0.5000 0.5000
3 =0.0625 0.0625 0.0625
4 =0.0625 0.0625 0.0625
Is this a file that you are reading into a dataframe?
I copied the above to a file named test.txt and used pandas.read_csv to load it into a dataframe. This is a little tricky since there are two separators in this file: an equal sign that may be preceded or followed by a space, and a space. This requires using a special separator that is a regular expression.
import pandas as pd

df = pd.read_csv("test.txt", sep=r"\s*=\s*|\s+", engine="python")
print(df)
print(df.dtypes)
Output:
   ID    MIC1  MIC_FLOAT  MIC_FOR_RANGES
0   1  0.0625     0.0625          0.0625
1   2  0.5000     0.5000          0.5000
2   3  0.0625     0.0625          0.0625
3   4  0.0625     0.0625          0.0625
ID                  int64
MIC1              float64
MIC_FLOAT         float64
MIC_FOR_RANGES    float64
dtype: object
The sep argument is a regular expression that says the separator might be an equal sign with optional leading and trailing spaces (\s*=\s*) or it might be a space (\s). Since this appears to be a poorly formatted file (not always using the same formatting) I decided to expand the second separator to one or more spaces (\s+). I had to set the parser engine to "python" so I could use a regular expression for the separator.
https://pandas.pydata.org/docs/reference...d_csv.html
Notice that the datatypes are all numbers without having to do any processing of the dataframe columns.
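If you want to try this without creating a file, here is a self-contained sketch that feeds the same rows to read_csv through io.StringIO. The re.split call is only there to show how the separator pattern cuts up one line; the names raw, sep and tokens are mine.

```python
import io
import re

import pandas as pd

# Same rows as the quoted data, held in memory instead of test.txt.
raw = """ID MIC1 MIC_FLOAT MIC_FOR_RANGES
1 = 0.0625 0.0625 0.0625
2 =0.5 0.5000 0.5000
3 =0.0625 0.0625 0.0625
4 =0.0625 0.0625 0.0625
"""

# An equal sign with optional surrounding whitespace, or a run of whitespace.
sep = r"\s*=\s*|\s+"

# Show how the pattern tokenizes a single data line.
tokens = re.split(sep, "1 = 0.0625 0.0625 0.0625")
print(tokens)  # ['1', '0.0625', '0.0625', '0.0625']

# A regular expression separator needs the python parser engine.
df = pd.read_csv(io.StringIO(raw), sep=sep, engine="python")
print(df.dtypes)
```

The order of the alternatives matters: the equal sign pattern comes first so that " = " is consumed as one separator instead of being split into a space, an equal sign, and another space.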