Posts: 37
Threads: 13
Joined: Apr 2018
May-06-2019, 05:16 PM
(This post was last modified: May-06-2019, 05:16 PM by parthi1705.)
Having huge no of records in original requirement.
Here a small peace of sample data.
results = [
(Decimal('28000.0000'), 2010, 272), (Decimal('308000.0000'), 2011, 272),
(Decimal('582000.0000'), 2012, 272), (Decimal('387000.0000'), 2013, 272),
(Decimal('367000.0000'), 2010, 281), (Decimal('3037000.0000'), 2011, 281),
(Decimal('4822000.0000'), 2012, 281), (Decimal('2936000.0000'), 2013, 281),
(Decimal('637000.0000'), 2010, 282), (Decimal('3440000.0000'), 2011, 282),
(Decimal('478000.0000'), 2010, 296), (Decimal('1209000.0000'), 2012, 296)
]
#Desired output
CalendarYear = [2010,2011,2012,2013]
Salesamount = [[28000.00,308000.00,582000.00,387000.00],[367000.00,3037000.00,4822000.00,2936000.00],[637000.00,3440000.00,0,0],[478000.00,0,1209000.00,0]]
Employee = [272,281,282,296] How to add zeros or null to Salesamount list by forming sublist as here each sublist should have 4 records.
For the year 2012 and 2013 data is not there for Employee 282 and for the year 2011 and 2013 data is not there for Employee 296 , so i need to add 0 or NULL .
Got some google info, but it is hardcode value, where as mine is a dynamic one.
https://stackoverflow.com/questions/2144...ain-length]
Also this is different from
https://python-forum.io/Thread-Split-Lis...rom-Pyodbc
as here we are splitting data not adding zeros or null.
For better understanding having single/ same set of data to work around.
Posts: 2,343
Threads: 62
Joined: Sep 2016
Posts: 37
Threads: 13
Joined: Apr 2018
Since my data is from pyodbc source i have modified my data there.
Have created database view with left join which has null value.
Posts: 37
Threads: 13
Joined: Apr 2018
May-10-2019, 11:55 AM
(This post was last modified: May-10-2019, 11:55 AM by parthi1705.)
Have tried some thing like below but it did not return my result
Had my data in pandas and tried, but no luck
>>> df
SalesamountQuota CalendarYear Employeekey
0 28000.0 2010 272
1 308000.0 2011 272
2 582000.0 2012 272
3 387000.0 2013 272
4 367000.0 2010 281
5 3037000.0 2011 281
6 4822000.0 2012 281
7 2936000.0 2013 281
8 637000.0 2010 282
9 3440000.0 2011 282
10 478000.0 2010 296
11 12090000.0 2012 296
df_emp =df[['CalendarYear','Employeekey']]
emp=df['Employeekey'].unique()
empnew=pd.DataFrame(data=emp.flatten())
empnew.columns =['Employeekey']
for i in empnew:
pd.merge(df_emp,df,left_on='Employeekey', right_on='CalendarYear',how='right')
Posts: 37
Threads: 13
Joined: Apr 2018
Is there a way to fill zero's?
Posts: 1,939
Threads: 8
Joined: Jun 2018
It's probably not the answer you looking for, but "the way to fill zero's" in case of iterables is:
>>> from itertools import zip_longest
>>> a = [1, 2, 3]
>>> b = [10, 20, 30, 40]
>>> list(zip_longest(a, b, fillvalue=0))
[(1, 10), (2, 20), (3, 30), (0, 40)]
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 37
Threads: 13
Joined: Apr 2018
From Database i am able to do it by adding Zeros , unable to do it in python.
Select * from #FactSalesQuota
SalesamountQuota CalendarYear Employeekey
28000.00 2010 272
308000.00 2011 272
582000.00 2012 272
387000.00 2013 272
367000.00 2010 281
3037000.00 2011 281
4822000.00 2012 281
2936000.00 2013 281
637000.00 2010 282
3440000.00 2011 282
478000.00 2010 296
1209000.00 2012 296 CREATE TABLE #CalendarSales_final (SalesamountQuota DECIMAL(20,4),CalendarYear INT,Employeekey INT)
SELECT Distinct Identity(INT,1,1) Rid,EmployeeKey INTO #temp FROM FactSalesQuota
SELECT DISTINCT Calendaryear INTO #dates FROM FactSalesQuota
DECLARE @i INT,@j INT=1,@Emp INT
SELECT @i =Rid FROM #temp
WHILE @j<=@i
Begin
SELECT @emp=EmployeeKey FROM #temp where Rid=@j
INSERT INTO #CalendarSales_final
SELECT ISNULL(C.salesamountQuota,0)SalesamountQuota, d.*,@emp FROM FactSalesQuota c Right join #Dates as d ON c.Calendaryear=d.Calendaryear
and EmployeeKey in (SELECT t. EmployeeKey FROM #temp as t WHERE Rid=@j)
Set @j=@j+1
end
SELECT * FROM #CalendarSales_final
Posts: 37
Threads: 13
Joined: Apr 2018
Any suggestions how can we do it in python by filling zeros.
Posts: 37
Threads: 13
Joined: Apr 2018
https://stackoverflow.com/questions/2144...ain-length
The above link adds zero with default position, were as my requirement is different from the above how to add zero's in my case.
Any suggestion would be highly appreciated.
Posts: 95
Threads: 3
Joined: May 2019
May-28-2019, 11:30 AM
(This post was last modified: May-28-2019, 12:15 PM by michalmonday.)
I added pandas way too, trying to make it a little more concise, it would be cool if some pandas wizard would post an optimal solution just to check how good it can possibly look like
def Decimal(x):
return x
results = [
(Decimal('28000.0000'), 2010, 272), (Decimal('308000.0000'), 2011, 272),
(Decimal('582000.0000'), 2012, 272), (Decimal('387000.0000'), 2013, 272),
(Decimal('367000.0000'), 2010, 281), (Decimal('3037000.0000'), 2011, 281),
(Decimal('4822000.0000'), 2012, 281), (Decimal('2936000.0000'), 2013, 281),
(Decimal('637000.0000'), 2010, 282), (Decimal('3440000.0000'), 2011, 282),
(Decimal('478000.0000'), 2010, 296), (Decimal('1209000.0000'), 2012, 296)
]
#Desired output
years = sorted(set([x[1] for x in results]))
employees = sorted(set([x[2] for x in results]))
sales = []
for emp in employees:
# get all the results for that particular employee
emp_results = [x for x in results if x[2] == emp]
# init/fill emp_sales with as many 0's as there are years
emp_sales = [0] * len(years)
for year_index, year in enumerate(years):
for result_year, result_sales_amount in [(x[1], x[0]) for x in emp_results]:
if year == result_year:
emp_sales[year_index] = result_sales_amount
sales.append(emp_sales)
print('years:\n', years, '\n')
print('employees:\n', employees, '\n')
print('sales:\n', sales, '\n') Output: years:
[2010, 2011, 2012, 2013]
employees:
[272, 281, 282, 296]
sales:
[['28000.0000', '308000.0000', '582000.0000', '387000.0000'], ['367000.0000', '3037000.0000', '4822000.0000', '2936000.0000'], ['637000.0000', '3440000.0000', 0, 0], ['478000.0000', 0, '1209000.0000', 0]]
import pandas as pd
def Decimal(x):
return x
results = [
(Decimal('28000.0000'), 2010, 272), (Decimal('308000.0000'), 2011, 272),
(Decimal('582000.0000'), 2012, 272), (Decimal('387000.0000'), 2013, 272),
(Decimal('367000.0000'), 2010, 281), (Decimal('3037000.0000'), 2011, 281),
(Decimal('4822000.0000'), 2012, 281), (Decimal('2936000.0000'), 2013, 281),
(Decimal('637000.0000'), 2010, 282), (Decimal('3440000.0000'), 2011, 282),
(Decimal('478000.0000'), 2010, 296), (Decimal('1209000.0000'), 2012, 296)
]
pd.set_option('display.float_format', '{:.2f}'.format)
df = pd.DataFrame(results, columns = ['sales' , 'year', 'employee'])
years = df['year'].unique()
employees = df['employee'].unique()
new_df = pd.DataFrame(index = employees, columns = years)
for emp in employees:
emp_df = df.loc[df['employee'] == emp]
new_df.loc[emp, : ] = [float(emp_df.loc[emp_df['year']==year]['sales']) if year in list(emp_df['year']) else 0 for year in years]
print(new_df) Output: 2010 2011 2012 2013
272 28000.00 308000.00 582000.00 387000.00
281 367000.00 3037000.00 4822000.00 2936000.00
282 637000.00 3440000.00 0 0
296 478000.00 0 1209000.00 0
|