clean crsp pandas

Liah
24 Sep 2016
# How to clean CRSP data from the WRDS database to calculate market cap for non-financial US firms
import numpy as np
import pandas as pd

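# Keep share class A or missing (matches a real NaN as well as the literal string 'NaN')
CRSP = CRSP[CRSP['SHRCLS'].isna() | CRSP['SHRCLS'].isin(['NaN', 'A'])]
# Keep only share codes 10 and 11 (ordinary common shares), whether SHRCD is stored as text or numbers
CRSP = CRSP[pd.to_numeric(CRSP['SHRCD'], errors='coerce').isin([10, 11])].copy()
# Clean returns by replacing the letter codes 'C' and 'B' with NaN
CRSP['RET'] = CRSP['RET'].replace(['C', 'B'], np.nan)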
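
# Convert returns to numbers and keep values above -50, which drops CRSP's
# negative missing-return codes (such as -66 and -99) along with NaN rows
CRSP['RET'] = pd.to_numeric(CRSP['RET'], errors='coerce')
CRSP = CRSP[CRSP['RET'] > -50].copy()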
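
CRSP['PRC'] = CRSP['PRC'].abs()  # CRSP stores a bid/ask average as a negative price; keep the absolute value

# Calculate market value from adjusted price and adjusted shares outstanding, lagged one period.
# The lag is taken within each firm so a firm's first row does not inherit the previous firm's value
# (assumes a PERMNO identifier column and rows sorted by date within each firm).
market_value = (CRSP['PRC'] / CRSP['CFACPR']) * (CRSP['SHROUT'] * CRSP['CFACSHR'])
CRSP['market_value'] = market_value.groupby(CRSP['PERMNO']).shift(1)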

# Drop rows whose SIC code is 'Z' or otherwise non-numeric, then store SIC as integers
CRSP['SICCD'] = pd.to_numeric(CRSP['SICCD'], errors='coerce')
CRSP = CRSP.dropna(subset=['SICCD']).copy()
CRSP['SICCD'] = CRSP['SICCD'].astype(int)

# Keep non-financial firms only (exclude SIC codes 6000 to 6999)
CRSP = CRSP[~CRSP['SICCD'].between(6000, 6999)].copy()

# Convert date from object to datetime and create new columns for year and month
CRSP['date'] = pd.to_datetime(CRSP['date'])
CRSP['year'] = CRSP['date'].dt.year
CRSP['month'] = CRSP['date'].dt.month

# Drop the columns that are no longer needed
CRSP = CRSP.drop(columns=['SHRCD', 'SHRCLS', 'PRC', 'SHROUT', 'CFACPR', 'CFACSHR'])
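
To check the return, SIC and market-value steps on something small, here is a minimal sketch that wraps them in a function and runs it on a made-up frame. The function name `clean_crsp`, the `toy` data, and the `PERMNO` firm identifier column are assumptions for illustration and are not part of the original snippet. The sketch also lags market value within each firm, which is one reasonable reading of the `shift(1)` step rather than a confirmed intent.

```python
import numpy as np
import pandas as pd


def clean_crsp(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: apply the return, market-value and SIC steps to a CRSP-like frame."""
    out = df.copy()

    # Returns: letter codes become NaN; NaN and negative sentinel codes fail the > -50 test.
    out['RET'] = pd.to_numeric(out['RET'], errors='coerce')
    out = out[out['RET'] > -50].copy()

    # Market value, lagged one row within each firm (PERMNO is an assumed column).
    out['date'] = pd.to_datetime(out['date'])
    out = out.sort_values(['PERMNO', 'date'])
    mv = (out['PRC'].abs() / out['CFACPR']) * (out['SHROUT'] * out['CFACSHR'])
    out['market_value'] = mv.groupby(out['PERMNO']).shift(1)

    # SIC: 'Z' and other non-numeric codes become NaN and are dropped.
    out['SICCD'] = pd.to_numeric(out['SICCD'], errors='coerce')
    out = out.dropna(subset=['SICCD']).copy()
    out['SICCD'] = out['SICCD'].astype(int)

    # Exclude financial firms (SIC 6000 to 6999).
    out = out[~out['SICCD'].between(6000, 6999)]
    return out.reset_index(drop=True)


# Made-up data: firm 1 is a manufacturer, firm 2 a bank, firm 3 has SIC 'Z'.
toy = pd.DataFrame({
    'PERMNO': [1, 1, 1, 2, 2, 3, 3],
    'date': ['2020-01-31', '2020-02-28', '2020-03-31',
             '2020-01-31', '2020-02-28', '2020-01-31', '2020-02-28'],
    'RET': ['0.01', 'C', '0.03', '0.02', '-99', '0.05', '0.01'],
    'PRC': [10.0, 11.0, -12.0, 20.0, 21.0, 5.0, 6.0],
    'SHROUT': [100, 100, 100, 50, 50, 10, 10],
    'CFACPR': 1.0,
    'CFACSHR': 1.0,
    'SICCD': ['3571', '3571', '3571', '6020', '6020', 'Z', 'Z'],
})

cleaned = clean_crsp(toy)
print(cleaned[['PERMNO', 'date', 'RET', 'SICCD', 'market_value']])
```

Only firm 1 survives the filters here. Note that because its February row is dropped for a non-numeric return, the March row's lagged market value comes from January, so a one-row lag is not always a one-month lag after filtering.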