I have a mongodb data like below:
code date num price money
0 2 2015-11-15 10 3.8 -38.0
1 2 2015-11-17 -10 3.7 37.0
2 2 2015-11-20 20 3.5 -70.0
3 2 2016-04-01 10 3.2 -32.0
4 2 2016-04-02 -30 3.6 108.0
5 2 2016-04-03 50 3.4 -170.0
6 2 2016-11-01 -40 3.5 140.0
7 3 2015-02-01 25 7.0 -175.0
8 3 2015-05-01 35 7.5 -262.5
9 3 2016-03-01 -15 8.0 120.0
10 5 2015-11-20 50 5.0 -250.0
11 5 2016-06-01 -50 5.5 275.0
12 6 2015-02-01 35 11.5 -402.5
I want to get the number of securities held and the funds currently occupied by the securities If I take out the data, I can get the result I want in the following way:
import pandas as pd
import numpy as np
df=pd.DataFrame({'code': [2,2,2,2,2,2,2,3,3,3,5,5,6],
'date': ['2015-11-15','2015-11-17','2015-11-20','2016-04-01','2016-04-02','2016-04-03','2016-11-01','2015-02-01','2015-05-01','2016-03-01','2015-11-20','2016-06-01','2015-02-01'],
'num' : [10,-10, 20, 10, -30,50, -40, 25, 35, -15, 50, -50, 35],
'price': [3.8,3.7,3.5,3.2, 3.6,3.4, 3.5, 7, 7.5, 8, 5, 5.5, 11.5],
'money': [-38,37,-70,-32, 108,-170, 140,-175,-262.5,120,-250, 275,-402.5]
})
print(df,"\n------------------------------------------\n")
df['hold'] = df.groupby(['code'])['num'].cumsum()
df['type'] = np.where(df['hold'] > 0, 'B', 'S')
df['total']=df['total1']= df.groupby(['code'])['money'].cumsum()
def abc(dfg):
if dfg[dfg['hold'] == 0]['hold'].count():
subT = dfg[dfg['hold'] == 0]['total1'].iloc[-1]
dfg['total'] = np.where(dfg['hold'] > 0, dfg['total']-subT, dfg['total'])
return dfg
dfR = df.groupby(['code'], as_index=False)\
.apply(abc) \
.drop(['type', 'total1'], axis=1) \
.reset_index(drop=True)
df1=dfR.groupby(['code']).tail(1)
print(df1,"\n------------------------------------------\n")
out
code date num price money *hold* *total*
6 2 2016-11-01 -40 3.5 140.0 *10* *-30.0*
9 3 2016-03-01 -15 8.0 120.0 *45* *-317.5*
11 5 2016-06-01 -50 5.5 275.0 *0* *25.0*
12 6 2015-02-01 35 11.5 -402.5 *35* *-402.5*
If use the mongodb method (such as aggregate, or other), how can i directly obtain the same result as above?