Graduate Advisor
April 27th, 2018
LoGal: galaxy within 30 Mpc
GClib: 662,772 GCs over 7,909 Galaxy
GClib models:
GClib model diversity:
BBHlib: extracted BBHs from 3,240 GC simulations
BBHdata: 17,883,760 BBHs with host galaxy info. + host GC info. + BBH info.
Merger event rate (6.1.3)
Localizationable BBHs: 14
PGC Dist M1 M2 Seperation Ecc
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 12.728000 19.122000 0.157281 0.852028
616 2.188 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 20.632000 137.730000 0.178258 0.477929
1628 0.787 21.424999 25.615999 0.165200 0.750944
1628 0.787 28.433001 26.503000 0.175125 0.813286
1628 0.787 3.158000 10.250000 0.807000 0.000000
131228 0.762 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 5.319600 10.276000 0.974700 0.000000
1627 0.813 5.319600 10.276000 0.974700 0.000000
object
due to ~
symbol => impute ~
as NaN and parse data type to float[=] check duplicates: explain pandas table, duplicated galaxies with Dist > 30 Mpc => drop
extract local galaxies within 30 Mpc and 3D view
spatial distribution and edge galaxies
check interesting features
Description
1- 7 I7 --- PGC [2,4715229]? Identifier from HYPERLEDA
(empty for globular clusters)
9- 36 A28 --- Name Common name of galaxy or globular
38- 46 F9.5 h RAhour Right ascension (J2000, decimal hours)
48- 55 F8.4 deg DEdeg Declination (J2000)
57- 60 F4.1 --- TT [-9,10]? Morphological type code (1)
62- 66 F5.2 mag Bmag ? Apparent blue magnitude
68- 74 F7.3 arcmin a ? Major diameter (arcmins)
76- 82 F7.3 arcmin e_a ? Error in major diameter (arcmins)
84- 90 F7.3 arcmin b ? Minor diameter (arcmins)
92- 98 F7.3 arcmin e_b ? Error in minor diameter (arcmins)
100-104 F5.3 --- b/a [0,1]? Ratio of minor to major diameters
106-110 F5.3 --- e_b/a ? Error on ratio of minor to major diameters
112-116 F5.1 deg PA [0,180]? Position angle of galaxy
(degrees from north through east)
118-123 F6.2 mag BMAG ? Absolute blue magnitude
125-131 F7.2 Mpc Dist ? Distance (Mpc)
133-138 F6.2 Mpc e_Dist ? error on Distance (Mpc)
140-143 F4.2 mag e_Bmag ? error on Apparent blue magnitude
145-148 F4.2 mag e_BMAG ? error on Absolute blue magnitude
# 1. load GWGC data
GWGC=pd.read_csv(path_GWGC, delim_whitespace=True)
for key, n in Counter(GWGC.dtypes).most_common():
print("'%s' (count %d): \n - %s" % (key, n, '\n - '.join(x for x in GWGC.columns[GWGC.dtypes==key])))
'object' (count 19): - Name - Type - App_Mag_B - err_App_Mag_B - Abs_Mag_B - err_Abs_Mag_B - App_Mag_I - err_App_Mag_I - Abs_Mag_I - err_Abs_Mag_I - App_Mag_K - err_App_Mag_K - Maj_Diam(a) - err_Maj_Diam - Min_Diam(b) - err_Min_Diam - b/a - err_b/a - PA 'float64' (count 4): - RA - Dec - Dist - err_Dist 'int64' (count 1): - PGC
# 1-. convert to numeric, impute '~' as NaN
for key in GWGC:
if key != 'Name':
GWGC[key]=pd.to_numeric(GWGC[key], errors='coerce')
# 2. check duplicates, duplicated galaxies with Dist > 30 Mpc
display(GWGC[GWGC.Name.duplicated(keep=False)] \
.loc[: ,['Name','Dist','err_Dist','RA','Dec','Type','App_Mag_B']])
Name | Dist | err_Dist | RA | Dec | Type | App_Mag_B | |
---|---|---|---|---|---|---|---|
63638 | PGC138606 | 34.347 | 5.152 | 3.18005 | 61.105202 | 5.0 | 16.32 |
68443 | PGC166081 | 72.361 | 15.919 | 4.09247 | 71.469597 | 7.0 | 17.47 |
68880 | PGC166081 | 24.847 | 3.727 | 4.09246 | 71.469470 | NaN | 17.39 |
130474 | PGC138606 | 35.569 | 7.825 | 3.17891 | 61.113250 | 1.2 | 15.51 |
178917 | 6dFJ1705055-200214 | 112.653 | 24.784 | 17.08485 | -20.037250 | NaN | NaN |
178919 | 6dFJ1704153-203840 | 117.694 | 25.893 | 17.07090 | -20.644350 | NaN | NaN |
179298 | 6dFJ1705055-200214 | 112.653 | 24.784 | 17.08485 | -20.037250 | NaN | NaN |
179620 | 6dFJ1704153-203840 | 116.194 | 25.563 | 17.07090 | -20.644350 | NaN | NaN |
[x] check duplicates
[=] extract local galaxies within 30 Mpc and 3D view: 8,946 galaxies out of 183,327, present in 3D plot
spatial distribution and edge galaxies: present in distplot
check interesting features
# 3. extract local galaxies within 30 Mpc and 3D view [slow]
GWGC['Mcat'] = 'Unknown'
GWGC['iMType'] = 0
GWGC.Mcat[GWGC.Type.between(-4,-3)] = 'E/S0'
GWGC.iMType[GWGC.Type.between(-4,-3)] = 6
GWGC.Mcat[GWGC.Type.between(0,11)] = 'S/Irr'
GWGC.iMType[GWGC.Type.between(0,11)] = 1
GWGC.Mcat[GWGC.Type.between(-6,-4)] = 'E'
GWGC.iMType[GWGC.Type.between(-6,-4)] = 2
GWGC.Mcat[GWGC.Type.between(-3,0)] = 'S0'
GWGC.iMType[GWGC.Type.between(-3,0)] = 3
LoGal = GWGC[GWGC.Dist<30]
if not flag_skip:
fig = figure(figsize=sfig_size)
ax = fig.add_subplot(111, projection='3d')
ax.scatter3D(GWGC.Dist*np.cos((GWGC.RA)/12*pi)*np.sin((90-GWGC.Dec)/180*pi),
GWGC.Dist*np.sin((GWGC.RA)/12*pi)*np.sin((90-GWGC.Dec)/180*pi),
GWGC.Dist*np.cos((90-GWGC.Dec)/180*pi),
alpha=0.1,s=2,edgecolors=None) # all GWGCs
ax.scatter3D(LoGal.Dist*np.cos((LoGal.RA)/12*pi)*np.sin((90-LoGal.Dec)/180*pi),
LoGal.Dist*np.sin((LoGal.RA)/12*pi)*np.sin((90-LoGal.Dec)/180*pi),
LoGal.Dist*np.cos((90-LoGal.Dec)/180*pi),alpha=0.1,c='g',s=2,edgecolors=None) # all LoGals
ax.view_init(20, 90)
r = [-100, 100]
for s, e in combinations(np.array(list(product(r, r, r))), 2):
if np.sum(np.abs(s-e)) == r[1]-r[0]:
ax.plot3D(*zip(s, e), color="k") # box to show orientation
ax.set_axis_off()
ax.set_xlim(left=-100,right=100)
ax.set_ylim(bottom=-100,top=100)
ax.set_zlim(-100,100)
ax.set_title('%d galaxies out of the %d galaxies are located in the Local Universe (within 30 Mpc)'
% (len(LoGal), len(GWGC)))
if flag_svfg:
fig.savefig(path.join(path_fig, 'GWGC3D.jpeg'),**params_svfg)
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'GWGC3D.jpeg')))
ax.grid(False)
[x] check duplicates
[x] extract local galaxies within 30 Mpc and 3D view
[=] spatial distribution and edge galaxies: present in distplot
be cautious with bin in histgrams
[-4, -3]: E/S0 => GCpG.iMType==6
[0, 11]: S/Irr => GCpG.iMType==1
[-6, -4): E => GCpG.iMType==2
(-3, 0): S0 => GCpG.iMType==3
NaN: Unknown, 0
# 4. edge galaxies with morphology type, be cautious with bins in histograms
fig, ax = subplots(figsize=fig_size)
sns.distplot(GWGC.Dist,color='g',ax=ax) # NO. of galaxies at different Dist
ax.set_xlim([0, 180])
if 'GWGC_DT' not in locals():
GWGC_DT = GWGC.loc[:,['Dist','err_Dist','Mcat']].dropna()
GWGC_DT['Dcat'] = 0
GWGC_DT.Dcat[(GWGC_DT.Dist>30) & (GWGC_DT.Dist-GWGC_DT.err_Dist<30)] = -1
GWGC_DT.Dcat[(GWGC_DT.Dist<30) & (GWGC_DT.Dist+GWGC_DT.err_Dist>30)] = 1
sub_axes = axes([0.16, 0.56, 0.25, 0.25])
sub_axes.set_yticklabels([])
sub_axes.set_xlim([24, 38])
sub_axes2=sub_axes.twinx()
sub_axes2.set_yticklabels([])
sub_axes.grid(None)
sub_axes2.grid(None)
sns.distplot(GWGC_DT.Dist[GWGC_DT.Dcat==-1],kde=False,ax=sub_axes)
sns.distplot(GWGC_DT.Dist[GWGC_DT.Dcat==1],kde=False,ax=sub_axes)
sns.distplot(GWGC_DT.Dist[GWGC_DT.Dcat!=0],hist=False,ax=sub_axes2,color='y')
sub_axes = axes([0.6, 0.25, 0.25, 0.25])
sub_axes.set_xlabel('Morphology Type')
sub_axes.set_yticklabels([])
sub_axes.hist([GWGC_DT.Mcat[(GWGC_DT.Dcat==-1) & (GWGC_DT.Mcat!='Unknown')].values,
GWGC_DT.Mcat[(GWGC_DT.Dcat==1) & (GWGC_DT.Mcat!='Unknown')].values],
histtype='bar')
if flag_svfg:
savefig(path.join(path_fig,'Distance.jpeg'),**params_svfg)
/opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg. warnings.warn("The 'normed' kwarg is deprecated, and has been " /opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg. warnings.warn("The 'normed' kwarg is deprecated, and has been " /opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg. warnings.warn("The 'normed' kwarg is deprecated, and has been "
[x] check duplicates
[x] extract local galaxies within 30 Mpc and 3D view
[x] spatial distribution and edge galaxies
[=] check interesting features: missing data
# 5-A. statistics
display(LoGal.loc[:,['Name','RA','Dec','Dist','Type','Abs_Mag_B','Abs_Mag_I']] \
.describe(include='all').loc[['count','unique','min','max','mean','std'],:])
Name | RA | Dec | Dist | Type | Abs_Mag_B | Abs_Mag_I | |
---|---|---|---|---|---|---|---|
count | 8946 | 8946.000000 | 8946.000000 | 8946.000000 | 5979.000000 | 7877.000000 | 5606.000000 |
unique | 8946 | NaN | NaN | NaN | NaN | NaN | NaN |
min | NaN | 0.005940 | -89.334590 | 0.010000 | -5.000000 | -22.590000 | -25.520000 |
max | NaN | 23.996680 | 86.131800 | 29.986000 | 10.000000 | -4.970000 | -1.680000 |
mean | NaN | 10.973250 | 3.455383 | 19.011495 | 4.524034 | -16.210734 | -17.158776 |
std | NaN | 5.277991 | 34.889882 | 6.616150 | 4.707900 | 2.451796 | 2.627928 |
# 5-B. visualization
img = msno.matrix(LoGal.loc[:, ['Name','RA','Dec','Dist','Type','Abs_Mag_B','Abs_Mag_I']], inline=False)
if flag_svfg:
img.savefig(path.join(path_fig, 'GWGC_NaN.jpeg'), **params_svfg)
check duplicates and NaNs
check interested features
Description
0 1- 14 A14 --- Name Galaxy identifier
16- 27 A12 --- OName Other galaxy identifier
29- 38 F10.7 h RAhour ? Right Ascension in decimal hours (J2000)
40- 50 F11.7 deg DEdeg ? Declination in decimal degrees (J2000)
4 52- 56 A5 --- MType Morphology type
5 58- 63 F6.2 Mpc Dist ? NED distance; see Section 2
65- 69 F5.2 Mpc e_Dist ? Uncertainty in Dist
71- 83 A13 --- n_Dist Method used to determine Dist
85- 89 F5.3 mag Av NED foreground V band extinction
9 91- 97 F7.3 mag VMag [-25/-11]? Absolute visual magnitude (1)
99-104 F6.3 mag e_VMag [0.2/1.5]? Uncertainty in VMag
11 106-115 F10.3 mag KMag [-28/-14]? Absolute K band magnitude (2)
117-125 F9.3 mag e_KMag [0.1/0.3]? Uncertainty in KMag
127 A1 --- l_Ngc [>] Limit flag on Ngc
14 128-134 F7.1 --- Ngc [0/32500] Number of Globular Clusters (NGC)
136-142 F7.1 --- e_Ngc [0/20000] Uncertainty in Ngc
144-152 A9 --- r_Ngc Reference(s) for Ngc (see refs.dat file)
17 154-159 F6.2 km/s sigma [10/414]? Stellar velocity dispersion (3)
161-165 F5.2 km/s e_sigma ? Uncertainty in sigma
19 167-171 F5.2 kpc Reff [0.1/55]? Effective radius (4)
173-177 F5.2 kpc e_Reff ? Uncertainty in Reff
21 179-184 F6.3 [Msun] logMd [7.6/12.8]? Log of dynamical mass
186-191 F6.3 [Msun] e_logMd ? Uncertainty in logMd
23 193-197 F5.2 [Msun] logMGC ? Log of total globular cluster mass (5)
199-203 F5.2 [Msun] e_logMGC ? Uncertainty in logMGC
25 205-209 F5.2 [Msun] logMB ? Log of central supermassive black hole mass(6)
211-215 F5.2 [Msun] E_logMB ? Upper uncertainty in logMB
217-221 F5.2 [Msun] e_logMB ? Lower uncertainty in logMB
MType: str
=> to numeric, & parse in Harris manner, present in barplot[=] check duplicates and NaNs: explain pandas table, 418 galaxy enteries
[VMag, KMag, Dist]
, MilkyWay => drop[VMag, KMag]
, A1689-BCG, 770 Mpc => droplogMGC
, 3 in Ngc
=> data is dirty, check in 4-A-bcheck interested features
# 1. load GCpG data
GCpG=pd.read_csv(path_GCpG)
# 2. MType to numeric, in Harris manner
GCpG['iMType'] = ''
GCpG['Mcat'] = ''
# E -> 2: 226(+18)
GCpG.iMType[GCpG.MType.str.contains(r'E')] = 2
GCpG.Mcat[GCpG.MType.str.contains(r'E')] = 'E'
# S0 -> 3: 75(+18)
GCpG.iMType[GCpG.MType.str.contains(r'S0')] = 3
GCpG.Mcat[GCpG.MType.str.contains(r'S0')] = 'S0'
# S/Irr -> 1: 103
GCpG.iMType[-(GCpG.MType.str.contains(r'E') | GCpG.MType.str.contains(r'S0'))] = 1
GCpG.Mcat[-(GCpG.MType.str.contains(r'E') | GCpG.MType.str.contains(r'S0'))] = 'S/Irr'
# E/S0 -> 6: 18
GCpG.iMType[GCpG.MType.str.contains(r'E') & GCpG.MType.str.contains(r'S0')] = 6
GCpG.Mcat[GCpG.MType.str.contains(r'E') & GCpG.MType.str.contains(r'S0')] = 'E/S0'
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
GCpG.loc[:,['Name','MType','Mcat']].groupby(['Mcat','MType']).count() \
.unstack().plot(kind='bar',stacked=True,legend=False,rot=0, ax=ax)
text(0.8, 150,'%10s' % GCpG.loc[:,['Name','Mcat']].groupby('Mcat').count() \
.to_string().replace('Name','count'), **params_font)
text(1.8, 150, '%6s' % GCpG[GCpG.Mcat.isin(['S0','E/S0'])].loc[:,['Name','MType','Mcat']] \
.groupby(['Mcat','MType']).count().to_string().replace('Name','count').replace(' ', '%2s' % ' '), **params_font)
xlabel('Morphology Type')
if flag_svfg:
savefig(path.join(path_fig, 'MType.jpeg'),**params_svfg)
print('Our catalog contains 248 ellipticals, 93 S0\'s,'+
' and 81 spirals or irregulars, (Harris et. al. 2013).')
Our catalog contains 248 ellipticals, 93 S0's, and 81 spirals or irregulars, (Harris et. al. 2013).
# 3. check duplicates and NaNs
display(GCpG.reindex([0,356,227,228,272,273,138]).loc[:,
['Name','RAhour','DEdeg','Dist','e_Dist','VMag','e_VMag','KMag',
'logMGC','Ngc','r_Ngc','sigma','Reff','logMd','logMB']])
if 0 in GCpG.index:
GCpG.drop([0,356,228,272],inplace=True)
Name | RAhour | DEdeg | Dist | e_Dist | VMag | e_VMag | KMag | logMGC | Ngc | r_Ngc | sigma | Reff | logMd | logMB | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | MilkyWay | NaN | NaN | NaN | NaN | -21.30 | 0.3 | NaN | 7.66 | 160.0 | 44 | 105.0 | 0.70 | 9.856 | 6.61 |
356 | A1689-BCG | 13.191528 | -1.338056 | 790.00 | 40.00 | NaN | NaN | NaN | 10.10 | 30000.0 | 66b | NaN | NaN | NaN | NaN |
227 | NGC4417 | 12.447387 | 9.584243 | 16.03 | 0.53 | -20.00 | 0.2 | -22.860 | 7.24 | 72.0 | 77 | 135.2 | 1.22 | 10.318 | NaN |
228 | NGC4417 | 12.447387 | 9.584243 | 16.03 | 0.53 | -20.00 | 0.2 | -22.860 | 7.24 | 72.0 | 77 | 135.2 | 1.22 | 10.318 | NaN |
272 | VCC-1386 | 12.530939 | 12.656659 | 16.00 | 2.00 | -17.34 | 0.2 | NaN | 6.67 | 26.6 | 67 | 20.5 | 1.80 | 8.848 | NaN |
273 | VCC-1386 | 12.530939 | 12.656659 | 16.10 | 1.13 | -16.90 | 0.2 | NaN | 6.64 | 26.0 | 24 | NaN | NaN | NaN | NaN |
138 | NGC2915 | 9.436528 | -76.626389 | 4.06 | 0.20 | -16.12 | 0.2 | -18.298 | NaN | 3.0 | 66a | NaN | 0.31 | NaN | NaN |
[x] check duplicates and NaNs
[=] check interested features:
[=] statistics, present in 2 pandas tables
var(Ngc(6:0))
significantly larger => add logNgc(6:-inf)
Ngc, e_Ngc(0:3), logMGC(1:NaN), e_logMGC(2:0)
: Ngc
derived from logMGC
=> focus on Ngc/logNgc(412/418)
Ngc == 0:
[7,51,87,88,271,304] => Ngc = 0.5 e_Ngc, avoid -inf
temporatelye_Ngc == 0:
[1,28,138] => e_Ngc = 0.5 Ngc, avoid -inf
temporatelylogMGC, e_logMGC == NaN:
[138] => linear interpolate NaN logMGC
from logNgc(412)
)e_logMGC ==0:
[1,28] => leave alonevisualization
# 4-A-a. statistics
display(GCpG.loc[:,['Name','VMag','KMag','Ngc','e_Ngc','r_Ngc',
'sigma','Reff','logMd','logMGC','Dist','logMB']] \
.describe(include='all').loc[['count','unique','min','max','mean','std'],:])
Name | VMag | KMag | Ngc | e_Ngc | r_Ngc | sigma | Reff | logMd | logMGC | Dist | logMB | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 418 | 418.000000 | 344.000000 | 418.000000 | 418.000000 | 418 | 271.000000 | 340.000000 | 253.000000 | 417.000000 | 418.000000 | 64.000000 |
unique | 418 | NaN | NaN | NaN | NaN | 106 | NaN | NaN | NaN | NaN | NaN | NaN |
min | NaN | -24.190000 | -27.080000 | 0.000000 | 0.000000 | NaN | 10.400000 | 0.130000 | 7.693000 | 4.880000 | 0.030000 | 0.000000 |
max | NaN | -11.170000 | -14.510000 | 32500.000000 | 13000.000000 | NaN | 414.000000 | 55.000000 | 12.726000 | 10.120000 | 375.000000 | 10.320000 |
mean | NaN | -19.127943 | -22.583840 | 1512.699761 | 351.961962 | NaN | 168.750923 | 4.287588 | 10.727692 | 7.369305 | 34.758971 | 5.933750 |
std | NaN | 2.799122 | 2.659367 | 3957.436127 | 1075.182192 | NaN | 93.365634 | 6.092928 | 0.969581 | 1.229806 | 46.445852 | 3.953719 |
# 4-A-b. i: take logarithms -> logNgc; ii: fit NaN in logMGC and e_logMGC
display(GCpG.loc[(GCpG.loc[:,['Ngc','e_Ngc','logMGC','e_logMGC']]==0).sum(axis=1)==True,
['Name','Ngc','e_Ngc','logMGC','e_logMGC']])
if 'has_GC' not in locals(): # i
has_GC = (GCpG.Ngc > 0) # mark for logMGC interpolation
GCpG.Ngc[~has_GC] = 0.5 * GCpG.e_Ngc[~has_GC]
GCpG.e_Ngc[GCpG.e_Ngc==0] = 0.5 * GCpG.Ngc[GCpG.e_Ngc==0]
if 'logNgc' not in GCpG.columns:
GCpG['logNgc'] = GCpG.Ngc.apply(log10)
if 'nn_MGC' not in locals(): # ii
nn_MGC = GCpG.logMGC.notna() # update 138: logMGC NaN from Ngc/logNgc
LRsk = linear_model.LinearRegression()
LRsk.fit(GCpG.logNgc[has_GC & nn_MGC].values.reshape(-1,1),
GCpG.logMGC[has_GC & nn_MGC].values.reshape(-1,1))
GCpG.logMGC[~nn_MGC] = LRsk.predict(GCpG.logNgc[~nn_MGC].values.reshape(-1,1))
LRsk.fit(GCpG.e_Ngc[has_GC & nn_MGC].apply(log10).values.reshape(-1,1),
GCpG.e_logMGC[has_GC & nn_MGC].values.reshape(-1,1))
GCpG.e_logMGC[~nn_MGC] = LRsk.predict(GCpG.e_Ngc[~nn_MGC].apply(log10).values.reshape(-1,1))
Name | Ngc | e_Ngc | logMGC | e_logMGC | |
---|---|---|---|---|---|
7 | NGC221 | 0.0 | 1.0 | 6.26 | 0.10 |
51 | FCC-0059 | 0.0 | 4.0 | 6.21 | 0.19 |
87 | FCC-242 | 0.0 | 3.3 | 5.51 | 0.30 |
88 | FCC-246 | 0.0 | 3.4 | 5.51 | 0.30 |
138 | NGC2915 | 3.0 | 0.0 | NaN | NaN |
271 | VCC-1363 | 0.0 | 3.0 | 6.19 | 0.24 |
304 | VCC-1714 | 0.0 | 4.5 | 6.19 | 0.16 |
[x] check duplicates and NaNs
[=] check interested features
[x] statistics, present in 2 pandas tables
[=] visualization
logNgc
: present KDE in distplot NaN
data: present in msnoplot# 4-B. visualization
fig = figure(figsize=fig_size)
ax = fig.add_subplot(211)
GCpG.logNgc.plot('kde',title='logNgc',ax=ax)
ax2 = fig.add_subplot(212)
GCpG.loc[:,GCpG.columns[GCpG.dtypes==float]].describe().loc['count',:].plot('bar', ax=ax2)
<matplotlib.axes._subplots.AxesSubplot at 0x7f34ba0885c0>
img = msno.matrix(GCpG.loc[:, ['Name','VMag','KMag','Ngc','r_Ngc',
'sigma','Reff','logMd','Dist','logMB','logNgc']], inline=False)
if flag_svfg:
img.savefig(path.join(path_fig, 'GCpG_NaN.jpeg'), **params_svfg)
[=] explore correlations for data type in float
: present in pairplots with heatmap
'Dist', 'e_Dist', 'Av', 'VMag', 'e_VMag', 'KMag', 'e_KMag', 'Ngc', 'e_Ngc', 'sigma', 'e_sigma', 'Reff', 'e_Reff', 'logMd', 'e_logMd', 'logMGC', 'e_logMGC','logMB', 'E_logMB', 'e_logMB', 'logNgc'
xxx & e_xxx
: intuitive, not interestingAv & Dist
: Dist
derived from NED Av
=> focus on Dist
(more intuitive)VMag & KMag(74:NaN)
: intuitive, LR in 2-ADist & Ngc(6:0) & Reff(78:NaN)
VMag & KMag(74:NaN) & sigma(147:NaN) & logMd(165:NaN) & logMGC(1:NaN) & logNgc(6:-inf)
check correlated pairs [and interpolate values]
correlation ranking
# 1. explore correlations [slow]
def pp_heat(df, cor = 0.5, figname = 'pairplots.jpeg'):
'''
Input
df: pandas dataframe
Output
corr: df.corr()
pairplot with heatmap
'''
with sns.axes_style("white"):
pp = sns.pairplot(df.fillna(0), diag_kind='kde', markers='x', plot_kws={'s':15,'alpha':0.2})
corr = df.corr()
for i, j in zip(*triu_indices_from(pp.axes, 1)):
pp.axes[i, j].set_visible(False)
# pp.axes[i, j].figure.text((j+0.5)/len(corr),(len(corr)-i-0.5)/len(corr),
# str(round(corr.values[i, j],3)), **params_font)
for i, j in zip(*tril_indices_from(pp.axes, -1)):
if abs(corr.values[i, j]) < cor or isnan(corr.values[i, j]):
sns.despine(ax=pp.axes[i, j],top=True,right=True,left=True,bottom=True)
else:
sns.despine(ax=pp.axes[i, j],top=False,right=False,left=False,bottom=False)
for pos in ['top','right','left','bottom']:
pp.axes[i, j].spines[pos].set_edgecolor('gray')
pp.axes[i, j].figure.text((j+0.7)/len(corr),(len(corr)-i-0.3)/len(corr),
str(round(corr.values[i, j],3)), **params_font)
ax = pp.axes[0,0].figure.add_subplot(222)
sns.heatmap(corr, square=True, cmap=cmap, alpha=0.8, ax=ax)
savefig(path.join(path_fig,figname), **params_svfg)
return 0
if not flag_skip:
pp_heat(GCpG.loc[:,GCpG.columns[GCpG.dtypes==float][2:]], 0.5, 'pp_GCpG.jpeg')
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'pp_GCpG.jpeg')))
ax.grid(False)
[x] explore correlations for data type in float
[=] check correlated pairs [and interpolate values]
[=] VMag(418) & KMag(344+74)
: LR interpolate => KMag(74:NaN)
, present in errorbar & OLS LR summary
Dist & Ngc(6:0) & Reff(78:NaN)
VMag(418)/KMag(344) & logMd(253)/sigma(271) & logMGC(417)/logNgc(412)
correlation ranking
# 2. check correlated pairs [and interpolate values]
# 2-A. `VMag` & `KMag`: linear regression
if 'nn_KMag' not in locals():
nn_KMag = GCpG.KMag.notna()
LRsk = linear_model.LinearRegression()
LRsk.fit(GCpG.VMag[nn_KMag].values.reshape(-1,1), GCpG.KMag[nn_KMag].values.reshape(-1,1))
GCpG.KMag[~nn_KMag] = LRsk.predict(GCpG.VMag[~nn_KMag].values.reshape(-1,1))
LRsk.fit(GCpG.e_VMag[nn_KMag].values.reshape(-1,1),
GCpG.e_KMag[nn_KMag].values.reshape(-1,1))
GCpG.e_KMag[~nn_KMag] = LRsk.predict(GCpG.e_VMag[~nn_KMag].values.reshape(-1,1))
LRols = sm.OLS(GCpG.KMag.values, GCpG.VMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.999 Dependent Variable: y AIC: 803.0247 Date: 2018-06-17 22:09 BIC: 807.0602 No. Observations: 418 Log-Likelihood: -400.51 Df Model: 1 F-statistic: 5.005e+05 Df Residuals: 417 Prob (F-statistic): 0.00 R-squared: 0.999 Scale: 0.39886 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 1.1304 0.0016 707.4353 0.0000 1.1273 1.1336 ------------------------------------------------------------------ Omnibus: 176.239 Durbin-Watson: 1.879 Prob(Omnibus): 0.000 Jarque-Bera (JB): 1964.668 Skew: 1.479 Prob(JB): 0.000 Kurtosis: 13.201 Condition No.: 1 ==================================================================
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
ax.errorbar(GCpG.VMag[nn_KMag], GCpG.KMag[nn_KMag],
xerr=GCpG.e_VMag[nn_KMag], yerr=GCpG.e_KMag[nn_KMag], fmt='g+', **params_errbar)
ax.errorbar(GCpG.VMag[~nn_KMag], GCpG.KMag[~nn_KMag],
xerr=GCpG.e_VMag[~nn_KMag], yerr=GCpG.e_KMag[~nn_KMag], fmt='rx', **params_errbar)
ax.invert_xaxis()
ax.invert_yaxis()
ax.set_xlabel('VMag')
ax.set_ylabel('KMag')
legend(['existing KMag vs VMag: %d/%d' % (sum(nn_KMag),len(nn_KMag)),
'interpolated KMag vs VMag: %d/%d' % (sum(~nn_KMag),len(nn_KMag))])
if flag_svfg:
savefig(path.join(path_fig,'KMag_VMag.jpeg'), **params_svfg)
float
[=] check correlated pairs [and interpolate values]
[x] VMag(418) & KMag(344+74)
[=]Dist & Ngc(6:0) & Reff(78:NaN)
: no significant correlation => leave alone, present in pairplot
VMag(418)/KMag(344) & logMd(253)/sigma(271) & logMGC(417)/logNgc(412)
correlation ranking
# 2-B. `Dist` & `Ngc` & `Reff`: no significant correlations
# pp = sns.pairplot(GCpG, vars= ['Dist','Ngc','Reff'], hue='Mcat', diag_kind= 'kde',
# plot_kws=params_hex, diag_kws={'lw':1}, size=3.2)
# for i, j in zip(*triu_indices_from(pp.axes, 1)):
# pp.axes[i, j].set_visible(False)
# upgrade to PairGrid https://seaborn.pydata.org/generated/seaborn.PairGrid.html
pp = sns.PairGrid(GCpG, vars= ['Dist','Ngc','Reff'], hue='Mcat',diag_sharey=False,size=3.2)
pp.map_lower(scatter, **params_hex)
pp.map_upper(sns.residplot)#, ci=97, scatter_kws={'s':15}, line_kws={'lw':1})
pp.map_diag(sns.kdeplot, lw=1)
pp.add_legend()
if flag_svfg:
savefig(path.join(path_fig,'pp_DNR.jpeg'), **params_svfg)
float
[x] check correlated pairs [and interpolate values]
VMag(418) & KMag(344+74)
[x] Dist & Ngc(6:0) & Reff(78:NaN)
[=] VMag(418)/KMag(344) & logMd(253)/sigma(271) & logMGC(417)/logNgc(412)
[=] logNgc(6:-inf) & logMGC(1:NaN)
: Ngc(6:0)
deduct from logMGC
(Harris et al. 2011), present in errorbar & OLS LR summary
logNgc(412-1)
LR interpolate logMGC
(1:NaN)
logMGC(418-1-6)
LR interpolate Ngc/logNgc
(6:0/-inf)
VMag/KMag (418/344+74) & logNgc/logMGC (412/417)
logMd(253) & sigma(271) & logNgc/logMGC (412/417)
correlation ranking
# 2-C-a. `logNgc` & `logMGC`: linear regression, interpolate logNgc from logMGC
LRsk = linear_model.LinearRegression()
LRsk.fit(GCpG.logMGC[has_GC].values.reshape(-1,1),
GCpG.logNgc[has_GC].values.reshape(-1,1))
GCpG.logNgc[~has_GC] = LRsk.predict(GCpG.logMGC[~has_GC].values.reshape(-1,1))
LRsk.fit(GCpG.e_logMGC[has_GC].values.reshape(-1,1),
GCpG.e_Ngc[has_GC].apply(log10).values.reshape(-1,1))
GCpG.e_Ngc[~has_GC] = 10**LRsk.predict(GCpG.e_logMGC[~has_GC].values.reshape(-1,1))
GCpG.Ngc[~has_GC] = 10**GCpG.logNgc[~has_GC]
LRols = sm.OLS(GCpG.logNgc.values, GCpG.logMGC.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.892 Dependent Variable: y AIC: 954.0297 Date: 2018-06-17 22:09 BIC: 958.0651 No. Observations: 418 Log-Likelihood: -476.01 Df Model: 1 F-statistic: 3437. Df Residuals: 417 Prob (F-statistic): 1.75e-203 R-squared: 0.892 Scale: 0.57241 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 0.2906 0.0050 58.6292 0.0000 0.2808 0.3003 ------------------------------------------------------------------ Omnibus: 40.119 Durbin-Watson: 1.430 Prob(Omnibus): 0.000 Jarque-Bera (JB): 18.367 Skew: 0.316 Prob(JB): 0.000 Kurtosis: 2.191 Condition No.: 1 ==================================================================
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
ax.set_xlim([-1,6])
ax.set_ylim([4.5,10])
ax.errorbar(GCpG.logNgc[has_GC], GCpG.logMGC[has_GC],
xerr=log10(GCpG.e_Ngc[has_GC]), yerr=GCpG.e_logMGC[has_GC],
fmt='g+', **params_errbar)
ax.errorbar(GCpG.logNgc[~has_GC], GCpG.logMGC[~has_GC],
xerr=log10(GCpG.e_Ngc[~has_GC]), yerr=GCpG.e_logMGC[~has_GC],
fmt='rx', **params_errbar)
legend(['existing logNgc vs logMGC','interpolated logNgc vs logMGC'], loc=4)
ax.set_xlabel('logNgc')
ax.set_ylabel('logMGC')
if flag_svfg:
savefig(path.join(path_fig,'Ngc_MGC.jpeg'), **params_svfg)
[x] explore correlations for data type in float
[x] check correlated pairs [and interpolate values]
[x] VMag(418) & KMag(344+74)
[x] Dist & Ngc(412+6) & Reff(78:NaN)
[=] VMag(418)/KMag(344) & logMd(253)/sigma(271) & logMGC(417)/logNgc(412)
[x] logNgc(412+6) & logMGC(417+1)
[=] VMag/KMag (418/344+74) & logNgc/logMGC (412/417)
: all complete
VMag
(original) vs logNgc
(more intuitive) logMd(253) & sigma(271) & logNgc/logMGC (412/417)
correlation ranking
# 2-C-b. `VMag(KMag) & logNgc(logMGC)`
pp = sns.pairplot(GCpG, vars=['VMag','KMag','logNgc','logMGC'], hue='Mcat',
plot_kws=params_scatter, diag_kind='kde', diag_kws={'lw':1})
for i, j in zip(*triu_indices_from(pp.axes, 1)):
pp.axes[i, j].set_visible(False)
pp.axes[0,0].invert_xaxis()
pp.axes[0,1].invert_xaxis()
pp.axes[1,0].invert_yaxis()
if flag_svfg:
savefig(path.join(path_fig,'pp_Luminosity.jpeg'), **params_svfg)
[x] explore correlations for data type in float
[x] check correlated pairs [and interpolate values]
[x] VMag(418) & KMag(344+74)
[x] Dist & Ngc(6:0) & Reff(78:NaN)
[=] VMag(418)/KMag(344) & logMd(253)/sigma(271) & logMGC(417+1)/logNgc(412+6)
[x] logNgc(412+6) & logMGC(417+1)
[x] VMag/KMag (418/344+74) & logNgc/logMGC (412+6/417+1)
[=] logMd(253) & sigma(271) & logNgc/logMGC (412+6/417+1)
logMd(253)
= f(sigma(271)
,Reff(340)
) vs logNgc(412+6)
:logMd(253) limiting termcorrelation ranking
# 2-C-c. `logMd & sigma & logNgc(logMGC)`
pp = sns.pairplot(GCpG, vars=['logMd','sigma','logNgc','logMGC'], hue='Mcat',
plot_kws=params_scatter, diag_kind='kde', diag_kws={'lw':1})
for i, j in zip(*triu_indices_from(pp.axes, 1)):
pp.axes[i, j].set_visible(False)
if flag_svfg:
savefig(path.join(path_fig,'pp_Dynamical.jpeg'), **params_svfg)
[x] explore correlations for data type in float
[x] check correlated pairs [and interpolate values]
[x] VMag & KMag
[x] Dist & Ngc & Reff
[x] VMag(KMag) & logMd (sigma) & logMGC(logNgc)
[=] correlation ranking: Random Forest with feature importance, explain in heatmap
VMag/KMag(344+74) vs logNgc(412+6)/logMGC(417+1)
KMag & logMGC > KMag & logNgc > VMag & logNgc
KMag & logNgc > KMag & logMGC > VMag & logMGC
KMag & logNgc > KMag & logMGC > VMag & logMGC
KMag & Ngc > logMd & logMGC > VMag & logMGC
, max correlation score drops# 3. correlation ranking, feature importance with Random Forest
## 3-A. impute median for NaN
## 3-B. impute mean for NaN
## 3-C. impute most frequent for NaN
## 3-D. 'normalize' by data avalibility 253/418
if 'strategies' not in locals():
features = ['Dist', 'e_Dist', 'Av', 'VMag', 'e_VMag', 'KMag', 'e_KMag', 'sigma', 'e_sigma',
'Reff', 'e_Reff', 'logMd', 'e_logMd', 'logMB', 'E_logMB', 'e_logMB']
targets = ['Ngc', 'e_Ngc', 'logNgc', 'logMGC', 'e_logMGC', 'iMType']
norm_feats = ['Dist', 'e_Dist', 'Av', 'VMag', 'e_VMag', 'KMag', 'e_KMag', 'sigma', 'e_sigma',
'Reff', 'e_Reff', 'logMd', 'e_logMd']
strategies = ['median', 'mean', 'most_frequent','norm']
def RF_NaN(df, feats, tags, strategy):
'''
Input
df: pandas dataframe
feats: list of feature colume names
tags: list of target colume names
strategy: str of strategy to deal with NaN
Output
df_rf: dataframe of Random Forest feature importances
'''
df_rf = pd.DataFrame(index=feats)
if strategy == 'norm':
X_rf = df.loc[:, feats].values
else:
X_rf = df.loc[:, feats].values
imp = Imputer(missing_values='NaN', strategy=strategy, axis=0)
X_rf = imp.fit_transform(X_rf)
RFreg = RandomForestRegressor()
for target in tags:
y_rf = df.loc[:, target].values
RFreg.fit(X_rf, y_rf)
df_rf[target] = RFreg.feature_importances_
return df_rf
axs = figure(figsize=sfig_size).subplots(2,2).ravel()
for i in range(len(strategies)):
if strategies[i] == 'norm':
df_rf = RF_NaN(GCpG.loc[:, norm_feats+targets].dropna(), norm_feats, targets, strategies[i])
axs[i].set_title("\'subset\' 253/418")
else:
df_rf = RF_NaN(GCpG, features, targets, strategies[i])
axs[i].set_title(strategies[i])
sns.heatmap(df_rf, cmap=cmap, alpha=0.8, ax=axs[i])
if flag_svfg:
savefig(path.join(path_fig,'RF_heat.jpeg'), **params_svfg)
logMd vs logMGC/logNgc
¶[=] dynamical mass model logMGC/logNgc
$=\mathscr{LR}$(logMd
)
[=] scatter plot with LR: strong correlation, present in scatter + regplot
OLS LR summary
luminosity model $\mathscr{LR}$(VMag/KMag
)
revisit with errorbar
# 1-A. `logMd vs logMGC/logNgc` in scatter
fig = figure(figsize=fig_size)
ax = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax.set_xlim([7.5,13])
ax2.set_xlim([7.5,13])
ax2.set_ylim([-0.5, 5.5])
for i in range(len(GCpG.Mcat.unique())):
mask_MT = (GCpG.Mcat == GCpG.Mcat.unique()[i])
ax.scatter(GCpG.logMd[mask_MT], GCpG.logMGC[mask_MT],**markers[i])
sns.regplot(GCpG.logMd[mask_MT], GCpG.logMGC[mask_MT],
scatter=False, ci=None, line_kws={'lw':1}, ax=ax)
ax2.scatter(GCpG.logMd[mask_MT], GCpG.logNgc[mask_MT],**markers[i])
sns.regplot(GCpG.logMd[mask_MT], GCpG.logNgc[mask_MT],
scatter=False, ci=None, line_kws={'lw':1}, ax=ax2)
sns.regplot(GCpG.logMd,GCpG.logMGC, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax)
sns.regplot(GCpG.logMd,GCpG.logNgc, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax2)
ax.legend()
ax2.legend()
if flag_svfg:
savefig(path.join(path_fig,'LR_Mdyn.jpeg'), **params_svfg)
logMd vs logMGC/logNgc
¶[=] dynamical mass model logMGC/logNgc
$=\mathscr{LR}$(logMd
)
[x] scatter plot with LR
[=] OLS LR summary: for logMd(253): logMGC(0.996) > logNgc(0.923)
luminosity model $\mathscr{LR}$(VMag/KMag
)
revisit with errorbar
# 1-B. `logMd vs logMGC/logNgc` in OLS regression
LRols = sm.OLS(GCpG.logNgc[GCpG.logMd.notna()].values,
GCpG.logMd[GCpG.logMd.notna()].values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.923 Dependent Variable: y AIC: 566.4509 Date: 2018-06-17 22:09 BIC: 569.9842 No. Observations: 253 Log-Likelihood: -282.23 Df Model: 1 F-statistic: 3037. Df Residuals: 252 Prob (F-statistic): 1.41e-142 R-squared: 0.923 Scale: 0.54723 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 0.2379 0.0043 55.1067 0.0000 0.2294 0.2464 ------------------------------------------------------------------ Omnibus: 10.559 Durbin-Watson: 1.338 Prob(Omnibus): 0.005 Jarque-Bera (JB): 5.680 Skew: 0.159 Prob(JB): 0.058 Kurtosis: 2.338 Condition No.: 1 ==================================================================
LRols = sm.OLS(GCpG.logMGC[GCpG.logMd.notna()].values,
GCpG.logMd[GCpG.logMd.notna()].values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.996 Dependent Variable: y AIC: 340.7403 Date: 2018-06-17 22:09 BIC: 344.2737 No. Observations: 253 Log-Likelihood: -169.37 Df Model: 1 F-statistic: 7.156e+04 Df Residuals: 252 Prob (F-statistic): 2.49e-311 R-squared: 0.996 Scale: 0.22424 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 0.7394 0.0028 267.5077 0.0000 0.7339 0.7448 ------------------------------------------------------------------ Omnibus: 5.236 Durbin-Watson: 1.515 Prob(Omnibus): 0.073 Jarque-Bera (JB): 7.433 Skew: -0.017 Prob(JB): 0.024 Kurtosis: 3.839 Condition No.: 1 ==================================================================
VMag/KMag vs logNgc
¶[x] dynamical mass model $\mathscr{LR}$(logMd
)
[=] luminosity model logNgc
$=\mathscr{LR}$(VMag/KMag
)
[=] scatter plot with LR: strong correlation, present in scatter + regplot
OLS LR summary
revisit with errorbar
# 2-A. `VMag/KMag vs logNgc` in scatter
fig = figure(figsize=fig_size)
ax = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax.set_xlim([-12, -27])
ax2.set_xlim([-12, -27])
for i in range(len(GCpG.Mcat.unique())):
mask_MT = (GCpG.Mcat == GCpG.Mcat.unique()[i])
ax.scatter(GCpG.VMag[mask_MT], GCpG.logNgc[mask_MT],**markers[i])
sns.regplot(GCpG.VMag[mask_MT], GCpG.logNgc[mask_MT],
scatter=False, ci=None, line_kws={'lw':1}, ax=ax)
ax2.scatter(GCpG.KMag[mask_MT], GCpG.logNgc[mask_MT],**markers[i])
sns.regplot(GCpG.KMag[mask_MT], GCpG.logNgc[mask_MT],
scatter=False, ci=None, line_kws={'lw':1}, ax=ax2)
sns.regplot(GCpG.VMag,GCpG.logNgc, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax)
sns.regplot(GCpG.KMag,GCpG.logNgc, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax2)
ax.legend()
ax2.legend()
if flag_svfg:
savefig(path.join(path_fig,'LR_Mag.jpeg'), **params_svfg)
VMag/KMag vs logNgc
¶[x] dynamical mass model $\mathscr{LR}$(logMd)
[x] luminosity model logNgc
=$\mathscr{LR}$(VMag/KMag)
[x] scatter plot with LR
[=] OLS LR summary
VMag(0.866) < KMag(0.869) < logMd(0.923/0.996)
logMd(253/418), KMag(344/418)
interpolated from VMag
logMd
requires good photometry dataKMag
revisit
# 2-B. `VMag/KMag vs logNgc` in OLS linear regression
LRols = sm.OLS(GCpG.logNgc.values, GCpG.VMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.866 Dependent Variable: y AIC: 1043.0025 Date: 2018-06-17 22:10 BIC: 1047.0379 No. Observations: 418 Log-Likelihood: -520.50 Df Model: 1 F-statistic: 2698. Df Residuals: 417 Prob (F-statistic): 3.34e-184 R-squared: 0.866 Scale: 0.70819 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 -0.1106 0.0021 -51.9461 0.0000 -0.1148 -0.1064 ------------------------------------------------------------------ Omnibus: 24.883 Durbin-Watson: 1.467 Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.434 Skew: 0.434 Prob(JB): 0.000 Kurtosis: 2.397 Condition No.: 1 ==================================================================
LRols = sm.OLS(GCpG.logNgc.values, GCpG.KMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.870 Dependent Variable: y AIC: 1030.4039 Date: 2018-06-17 22:10 BIC: 1034.4393 No. Observations: 418 Log-Likelihood: -514.20 Df Model: 1 F-statistic: 2794. Df Residuals: 417 Prob (F-statistic): 6.21e-187 R-squared: 0.870 Scale: 0.68716 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 -0.0980 0.0019 -52.8557 0.0000 -0.1017 -0.0944 ------------------------------------------------------------------ Omnibus: 22.487 Durbin-Watson: 1.448 Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.376 Skew: 0.453 Prob(JB): 0.000 Kurtosis: 2.458 Condition No.: 1 ==================================================================
[x] dynamical mass model logMGC/logNgc
=$\mathscr{LR}$(logMd
)
[x] luminosity model logNgc
=$\mathscr{LR}$(VMag/KMag)
[=] revisit with errorbar, present in 2 errorbars
# 3-A. revisit `logMd vs logMGC/logNgc` with errorbar
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
ax2 = ax.twinx()
ax.set_xlim([7.5,13])
for i in range(len(GCpG.Mcat.unique())):
mask_MT = (GCpG.Mcat == GCpG.Mcat.unique()[i])
ax.errorbar(GCpG.logMd[mask_MT], GCpG.logMGC[mask_MT],
xerr=GCpG.e_logMd[mask_MT], yerr=GCpG.e_logMGC[mask_MT], **fmts[i])
ax2.errorbar(GCpG.logMd[mask_MT], GCpG.logNgc[mask_MT],
xerr=GCpG.e_logMd[mask_MT], yerr=GCpG.e_Ngc[mask_MT].apply(log10), **fmts[i])
ax.set_ylabel('logMGC $[M_\odot]$')
ax2.set_ylabel('logNgc')
ax.set_xlabel('logMd $[M_\odot]$')
sns.regplot(GCpG.logMd,GCpG.logMGC, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax)
sns.regplot(GCpG.logMd,GCpG.logNgc, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax2)
ax.legend()
if flag_svfg:
savefig(path.join(path_fig,'LR_Mdyn_err.jpeg'), **params_svfg)
LRols = sm.OLS(GCpG.logNgc.values, GCpG.KMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.870 Dependent Variable: y AIC: 1030.4039 Date: 2018-06-17 22:10 BIC: 1034.4393 No. Observations: 418 Log-Likelihood: -514.20 Df Model: 1 F-statistic: 2794. Df Residuals: 417 Prob (F-statistic): 6.21e-187 R-squared: 0.870 Scale: 0.68716 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 -0.0980 0.0019 -52.8557 0.0000 -0.1017 -0.0944 ------------------------------------------------------------------ Omnibus: 22.487 Durbin-Watson: 1.448 Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.376 Skew: 0.453 Prob(JB): 0.000 Kurtosis: 2.458 Condition No.: 1 ==================================================================
logMd vs logMGC/logNgc
and VMag/KMag vs logNgc
¶[x] dynamical mass model logMGC/logNgc
=$\mathscr{LR}$(logMd
)
[x] luminosity model logNgc
=$\mathscr{LR}$(VMag/KMag)
[=] revisit with errorbar, present in 2 errorbars
# 3. revisit `VMag/KMag vs logNgc` with errorbar
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
ax2 = ax.twiny()
ax.set_xlim([-12, -27])
ax2.set_xlim([-12, -27])
sub_axes = axes([0.15, 0.42, 0.4, 0.4])
for i in range(len(GCpG.Mcat.unique())):
mask_MT = (GCpG.Mcat == GCpG.Mcat.unique()[i])
ax.errorbar(GCpG.VMag[mask_MT], GCpG.logNgc[mask_MT],
xerr=GCpG.e_VMag[mask_MT], yerr=GCpG.e_Ngc[mask_MT].apply(log10), **fmts[i])
ax2.errorbar(GCpG.KMag[mask_MT], GCpG.logNgc[mask_MT],
xerr=GCpG.e_KMag[mask_MT], yerr=GCpG.e_Ngc[mask_MT].apply(log10), **fmts[i])
sub_axes.scatter(GCpG.VMag[mask_MT], GCpG.logNgc[mask_MT], **markers[i])
sns.regplot(GCpG.VMag,GCpG.logNgc, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax)
sns.regplot(GCpG.KMag,GCpG.logNgc, scatter=False, ci=ci, line_kws={'lw':1}, ax=ax2)
ax.set_ylabel('logNgc')
ax.set_xlabel('VMag')
ax2.set_xlabel('KMag')
sub_axes.set_xlim([-10, -26])
sub_axes.grid(False)
sub_axes.axis('off')
sub_axes.legend()
sub_axes.set_title('Fig. 3, Rodriguez et al. (2015)')
if flag_svfg:
savefig(path.join(path_fig,'LR_Mag_err.jpeg'), **params_svfg)
LRols = sm.OLS(GCpG.logNgc.values, GCpG.KMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.870 Dependent Variable: y AIC: 1030.4039 Date: 2018-06-17 22:10 BIC: 1034.4393 No. Observations: 418 Log-Likelihood: -514.20 Df Model: 1 F-statistic: 2794. Df Residuals: 417 Prob (F-statistic): 6.21e-187 R-squared: 0.870 Scale: 0.68716 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 -0.0980 0.0019 -52.8557 0.0000 -0.1017 -0.0944 ------------------------------------------------------------------ Omnibus: 22.487 Durbin-Watson: 1.448 Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.376 Skew: 0.453 Prob(JB): 0.000 Kurtosis: 2.458 Condition No.: 1 ==================================================================
VMag
, normalized to $M_V$ = −15, present in scatter[=] LOWESS regression for $S_N$ vs VMag
, present in plot
error comparison
# 1. compute GC SN and represent
if 'Sn' not in GCpG.columns:
GCpG['Sn'] = GCpG.Ngc*10**(0.4*(GCpG.VMag+15.0))
GCpG['e_Sn'] = GCpG.e_Ngc*10**(0.4*(GCpG.VMag+15.0))
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
ax.set_ylim([-2,40])
ax.set_xlim([-10,-25])
ax.set_xlabel("VMag")
ax.set_ylabel("GC SN")
for i in range(len(GCpG.Mcat.unique())):
mask_MT = (GCpG.Mcat == GCpG.Mcat.unique()[i])
ax.scatter(GCpG.VMag[mask_MT], GCpG.Sn[mask_MT], **markers[i])
ax.set_title('Fig. 10, Harris et al. (2013)')
# 2. lowess regression
lowess = sm.nonparametric.lowess(GCpG.Sn, GCpG.VMag, frac=0.36) # lowess(y, X)
f_Sn = interp1d(lowess[:,0],lowess[:,1],bounds_error=False)
ax.plot(GCpG.VMag.sort_values(), f_Sn(GCpG.VMag.sort_values()))
ax.plot(GCpG.VMag.sort_values(), f_Sn(GCpG.VMag.sort_values()),'*', markersize=16,
markevery=len(GCpG)-1,label='VMag $\in$ [%.2f,%.2f]' % (GCpG.VMag.max(), GCpG.VMag.min()))
ax.legend(loc=0)
if flag_svfg:
savefig(path.join(path_fig, 'GCSN.jpeg'), **params_svfg)
LRols = sm.OLS(GCpG.logNgc.values, GCpG.KMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.870 Dependent Variable: y AIC: 1030.4039 Date: 2018-06-17 22:10 BIC: 1034.4393 No. Observations: 418 Log-Likelihood: -514.20 Df Model: 1 F-statistic: 2794. Df Residuals: 417 Prob (F-statistic): 6.21e-187 R-squared: 0.870 Scale: 0.68716 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 -0.0980 0.0019 -52.8557 0.0000 -0.1017 -0.0944 ------------------------------------------------------------------ Omnibus: 22.487 Durbin-Watson: 1.448 Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.376 Skew: 0.453 Prob(JB): 0.000 Kurtosis: 2.458 Condition No.: 1 ==================================================================
VMag
[x] LOWESS regression for $S_N$ vs VMag
[=] error comparison, zoom in view, justificationin errorbar
VMag
range [-11.17, -24.19]
: lower-bound# 3. error comparison, zoomed in
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
ax.axis([-10,-25,-2,40])
ax.set_xlabel("VMag")
ax.set_ylabel("GC SN")
axins = zoomed_inset_axes(ax, 3, loc=9)
axins.axis([-18, -20, -0.2, 4.8])
for i in range(len(GCpG.Mcat.unique())):
mask_MT = (GCpG.Mcat == GCpG.Mcat.unique()[i])
ax.scatter(GCpG.VMag[mask_MT], GCpG.Sn[mask_MT], alpha=0.8, **markers[i])
ax.errorbar(GCpG.VMag[mask_MT], GCpG.Sn[mask_MT],
xerr=GCpG.e_VMag[mask_MT], yerr=GCpG.e_Sn[mask_MT], **fmts[i])
axins.scatter(GCpG.VMag[mask_MT], GCpG.Sn[mask_MT], alpha=0.8, **markers[i])
axins.errorbar(GCpG.VMag[mask_MT], GCpG.Sn[mask_MT],
xerr=GCpG.e_VMag[mask_MT], yerr=GCpG.e_Sn[mask_MT], **fmts[i])
ax.legend(GCpG.Mcat.unique())
ax.add_patch(patches.Rectangle((-20,-0.2),2,5,linewidth=1,edgecolor='k',facecolor='none'))
axins.add_patch(patches.Rectangle((-19.99,-0.19),1.98,4.96,linewidth=1,edgecolor='k',facecolor='none'))
ax.set_title('Fig. 10, Harris et al. (2013)')
axins.grid(False)
lowess = sm.nonparametric.lowess(GCpG.Sn, GCpG.VMag, frac=0.36) # lowess(y, X)
f_Sn = interp1d(lowess[:,0],lowess[:,1],bounds_error=False)
ax.plot(GCpG.VMag.sort_values(), f_Sn(GCpG.VMag.sort_values()))
axins.plot(GCpG.VMag.sort_values(), f_Sn(GCpG.VMag.sort_values()))
_, pp1,pp2 = mark_inset(ax,axins,loc1=1, loc2=1, color='k')
pp1.loc1=1
pp1.loc2=2
pp2.loc1=3
pp2.loc2=4
if flag_svfg:
savefig(path.join(path_fig, 'GCSN_err.jpeg'), **params_svfg)
LRols = sm.OLS(GCpG.logNgc.values, GCpG.KMag.values).fit() # OLS(y, X)
print(LRols.summary2())
Results: Ordinary least squares ================================================================== Model: OLS Adj. R-squared: 0.870 Dependent Variable: y AIC: 1030.4039 Date: 2018-06-17 22:10 BIC: 1034.4393 No. Observations: 418 Log-Likelihood: -514.20 Df Model: 1 F-statistic: 2794. Df Residuals: 417 Prob (F-statistic): 6.21e-187 R-squared: 0.870 Scale: 0.68716 --------------------------------------------------------------------- Coef. Std.Err. t P>|t| [0.025 0.975] --------------------------------------------------------------------- x1 -0.0980 0.0019 -52.8557 0.0000 -0.1017 -0.0944 ------------------------------------------------------------------ Omnibus: 22.487 Durbin-Watson: 1.448 Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.376 Skew: 0.453 Prob(JB): 0.000 Kurtosis: 2.458 Condition No.: 1 ==================================================================
[=] bandpass conversion: B/I/K
in GWGC
from White et al. (2011) to V/K
in GCSN model from Harris et al. (2013)
[=] GWGC luminosities, Abs_Mag_B/I(7,877/5,606), App_Mag_B/I/K(7,877/5,606/2,995), and err_Abs/App_Mag_B/I/K
:
explore correlations
fit conversion constant
choose most probable value
revert to Ngc
GC distribution
# 1-A. statistics of GWGC luminosity
display(LoGal.loc[:,LoGal.columns[LoGal.columns.str.contains('Mag')]] \
.describe().loc[['count','min','max','mean','std'],:].transpose().merge(
LoGal.loc[:,LoGal.columns[LoGal.columns.str.contains('Mag')]].isna().sum(axis=0).to_frame('NaNs'),
left_index=True,right_index=True).transpose())
App_Mag_B | err_App_Mag_B | Abs_Mag_B | err_Abs_Mag_B | App_Mag_I | err_App_Mag_I | Abs_Mag_I | err_Abs_Mag_I | App_Mag_K | err_App_Mag_K | |
---|---|---|---|---|---|---|---|---|---|---|
count | 7877.000000 | 7877.000000 | 7877.000000 | 7877.000000 | 5606.000000 | 5606.000000 | 5606.000000 | 0.0 | 2995.000000 | 2287.00000 |
min | -4.600000 | 0.040000 | -22.590000 | 0.070000 | 5.730000 | 0.030000 | -25.520000 | NaN | 7.271000 | 0.02000 |
max | 25.260000 | 3.020000 | -4.970000 | 3.020000 | 30.440000 | 6.880000 | -1.680000 | NaN | 15.978000 | 0.29000 |
mean | 14.981126 | 0.377188 | -16.210734 | 0.387227 | 14.158673 | 0.174930 | -17.158776 | NaN | 12.968526 | 0.10208 |
std | 2.280469 | 0.176294 | 2.451796 | 0.175116 | 2.463852 | 0.317297 | 2.627928 | NaN | 1.594678 | 0.04896 |
NaNs | 1069.000000 | 1069.000000 | 1069.000000 | 1069.000000 | 3340.000000 | 3340.000000 | 3340.000000 | 8946.0 | 5951.000000 | 6659.00000 |
img = msno.matrix(LoGal.loc[:,LoGal.columns[LoGal.columns.str.contains('Mag')]], inline=False)
if flag_svfg:
img.savefig(path.join(path_fig, 'GWGC_mags_NaN.jpeg'), **params_svfg)
[=] bandpass conversion: B/I/K
in GWGC
from White et al. (2011) to V/K
in GCSN model from Harris et al. (2013)
[x] GWGC luminosities
[=] explore correlations, explain in pairplots with heatmap
xxx & err_xxx
: intuitive, not interestingabs_x & app_x
: b'cos $app\_x-abs\_X=5\log_{10}(d)-5$err_Abs_x & err_App_x
: of course, passfit conversion constant
choose most probable value
revert to Ngc
GC distribution
# 1-B. correlations between bands
if not flag_skip:
corr = pp_heat(LoGal.loc[:,LoGal.columns[LoGal.columns.str.contains('Mag')]], 0.8, 'pp_GWGC_mags.jpeg')
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'pp_GWGC_mags.jpeg')))
ax.grid(False)
[=] bandpass conversion: B/I/K
in GWGC
from White et al. (2011) to V/K
in GCSN model from Harris et al. (2013)
[x] GWGC luminosities
[x] explore correlations
[=] fit conversion constant, 197 galaxies in both catalogs, inspect in mag/color scatter
[Abs_Mag_B (7,877), err_Abs_Mag_B (7,877)] > [Abs_Mag_I (5,606), err_Abs_Mag_I (0)]
: => prefer B over I for Ngc estimation B/I in GWGC (197/148) vs V/K in GCSN
=> almost homogenous, prefer B for fitB/I in GWGC (197/148) vs V-K in GCSN
=> color/Mcat didn't help (unequal axis)choose most probable value
revert to Ngc
GC distribution
# 1-C. fit conversion constant, color scatter
idx_GCpG = GCpG.Name.isin(LoGal.Name)
idx_LoGal = LoGal.Name.isin(GCpG.Name)
nullfmt = NullFormatter() # no labels
left, width = 0.1, 0.65
bottom, height = 0.1, 0.65
bottom_h = left_h = left + width + 0.02
rect_main = [left, bottom, width, height]
rect_histx = [left, bottom_h, width, 0.2]
rect_histy = [left_h, bottom, 0.2, height]
figure(figsize=sfig_size)
ax_histx = axes(rect_histx)
ax_histy = axes(rect_histy)
ax_histx.xaxis.set_major_formatter(nullfmt)
ax_histy.yaxis.set_major_formatter(nullfmt)
ax_main = axes(rect_main)
ax_main.set_xlabel('VMag in GCpG from Harris et al. (2013)')
ax_main.set_ylabel('BMag in GWGC from White et al. (2011)')
ax_main.scatter(GCpG.VMag[idx_GCpG], LoGal.Abs_Mag_B[idx_LoGal],
c=GCpG.iMType[idx_GCpG], s=40*(2+log2(30/GCpG.Dist[idx_GCpG])), cmap=cmap)
ax_main.errorbar(GCpG.VMag[idx_GCpG], LoGal.Abs_Mag_B[idx_LoGal],
xerr=GCpG.e_VMag[idx_GCpG], yerr=LoGal.err_Abs_Mag_B[idx_LoGal],
fmt='+', elinewidth=3, alpha=0.6)
xx, yy = mgrid[ax_main.get_xlim()[0]:ax_main.get_xlim()[1]:100j,
ax_main.get_ylim()[0]-0.5:ax_main.get_ylim()[1]+0.5:100j]
positions = vstack([xx.ravel(), yy.ravel()])
values = vstack([GCpG.VMag[idx_GCpG].values, LoGal.Abs_Mag_B[idx_LoGal].values])
kernel = st.gaussian_kde(values)
zz = reshape(kernel(positions).T, xx.shape)
cfset = ax_main.contourf(xx, yy, zz, cmap='Greens', alpha=0.3) # Contourf plot
cset = ax_main.contour(xx, yy, zz, colors='gray', alpha=0.7) # Contour plot
ax_main.clabel(cset, inline=1, fontsize=16)
ax_main.legend(['Galaxy in both catalog: %d' % idx_GCpG.sum()], fontsize=16)
ax_main.set_xlim([-23,-16])
ax_main.set_ylim([-22,-15])
ax_histx.set_xlim(ax_main.get_xlim())
ax_histy.set_ylim(ax_main.get_ylim())
binwidth = 0.25
xymax = max([max(fabs(GCpG.VMag[idx_GCpG])), max(fabs(LoGal.Abs_Mag_B[idx_LoGal]))])
lim = (int(xymax/binwidth) + 1) * binwidth
bins = arange(-lim, lim + binwidth, binwidth)
_ = ax_histx.hist(GCpG.VMag[idx_GCpG], bins=bins)
_ = ax_histy.hist(LoGal.Abs_Mag_B[idx_LoGal], bins=bins, orientation='horizontal')
if flag_svfg:
savefig(path.join(path_fig,'joint_mags.jpeg'), **params_svfg)
# 1-C. fit conversion constant, color scatter
figure(figsize=sfig_size)
ax_histx = axes(rect_histx)
ax_histy = axes(rect_histy)
ax_histx.xaxis.set_major_formatter(nullfmt)
ax_histy.yaxis.set_major_formatter(nullfmt)
ax_main = axes(rect_main)
ax_main.set_xlabel('VMag-KMag in GCpG from Harris et al. (2013)')
ax_main.set_ylabel('BMag in GWGC from White et al. (2011)')
ax_main.scatter(GCpG.VMag[idx_GCpG]-GCpG.KMag[idx_GCpG], LoGal.Abs_Mag_B[idx_LoGal],
c=GCpG.iMType[idx_GCpG], s=40*(2+log2(30/GCpG.Dist[idx_GCpG])), cmap=cmap)
ax_main.errorbar(GCpG.VMag[idx_GCpG]-GCpG.KMag[idx_GCpG], LoGal.Abs_Mag_B[idx_LoGal],
xerr=GCpG.e_VMag[idx_GCpG]-GCpG.e_KMag[idx_GCpG], yerr=LoGal.err_Abs_Mag_B[idx_LoGal],
fmt='+', elinewidth=3, alpha=0.6)
xx, yy = mgrid[ax_main.get_xlim()[0]:ax_main.get_xlim()[1]:100j,
ax_main.get_ylim()[0]-0.5:ax_main.get_ylim()[1]+0.5:100j]
positions = vstack([xx.ravel(), yy.ravel()])
values = vstack([GCpG.VMag[idx_GCpG].values-GCpG.KMag[idx_GCpG].values,
LoGal.Abs_Mag_B[idx_LoGal].values])
kernel = st.gaussian_kde(values)
zz = reshape(kernel(positions).T, xx.shape)
cfset = ax_main.contourf(xx, yy, zz, cmap='Greens', alpha=0.3) # Contourf plot
cset = ax_main.contour(xx, yy, zz, colors='gray', alpha=0.7) # Contour plot
ax_main.clabel(cset, inline=1, fontsize=16)
ax_main.legend(['Galaxy in both catalog: %d' % idx_GCpG.sum()], fontsize=16)
ax_main.set_xlim([1,4])
ax_main.set_ylim([-22,-14])
ax_histx.set_xlim(ax_main.get_xlim())
ax_histy.set_ylim(ax_main.get_ylim())
binwidth = 0.25
xymax = max([max(fabs(GCpG.VMag[idx_GCpG])), max(fabs(LoGal.Abs_Mag_B[idx_LoGal]))])
lim = (int(xymax/binwidth) + 1) * binwidth
bins = arange(-lim, lim + binwidth, binwidth)
_ = ax_histx.hist(GCpG.VMag[idx_GCpG]-GCpG.KMag[idx_GCpG], bins=bins)
_ = ax_histy.hist(LoGal.Abs_Mag_B[idx_LoGal], bins=bins, orientation='horizontal')
if flag_svfg:
savefig(path.join(path_fig,'joint_color.jpeg'), **params_svfg)
[=] bandpass conversion: B/I/K
in GWGC
from White et al. (2011) to V/K
in GCSN model from Harris et al. (2013)
[x] GWGC luminosities
[x] explore correlations
[x] fit conversion constant: B/I in GWGC (197/148) vs V/K in GC SN
on 197 shared galaxies
[=] choose most probable value, based on kdeplot
revert to Ngc
GC distribution
# 1-D. most probable value from kdeplot
f_joint, (ax1, ax2, ax3) = subplots(3, sharex=True, sharey=True, figsize=sfig_size)
ax1.set_xlim([-25,-10])
p_v = sns.kdeplot(GCpG.VMag[idx_GCpG], shade=True, ax=ax1, color='g')
x,y = p_v.get_lines()[0].get_data()
V_max = x[y.argmax()]
ax1.vlines(V_max, 0, y.max(), linestyle='dashed')
ax1.legend(['$\mathrm{VMag}_{\max}$ = %.2f, from Harris et al. (2013)' % V_max])
p_b = sns.kdeplot(LoGal.Abs_Mag_B[idx_LoGal], shade=True, ax=ax2, color='b')
x,y = p_b.get_lines()[0].get_data()
B_max = x[y.argmax()]
ax2.vlines(B_max, 0, y.max(), linestyle='dashed')
ax2.legend(['$\mathrm{BMag}_{\max}$ = %.2f, from White et al. (2011)' % B_max])
p_i = sns.kdeplot(LoGal.Abs_Mag_I[idx_LoGal], shade=True, ax=ax3, color='r')
x,y = p_i.get_lines()[0].get_data()
I_max = x[y.argmax()]
ax3.vlines(I_max, 0, y.max(), linestyle='dashed')
ax3.legend(['$\mathrm{IMag}_{\max}$ = %.2f, from White et al. (2011)' % I_max])
f_joint.subplots_adjust(hspace=0)
if flag_svfg:
savefig(path.join(path_fig,'mags_max.jpeg'), **params_svfg)
[x] bandpass conversion: B/I/K
in GWGC
from White et al. (2011) to V/K
in GCSN model from Harris et al. (2013)
[x] GWGC luminosities
[x] explore correlations
[x] fit conversion constant: B/I in GWGC (197/148) vs V/K in GC SN
on 197 shared galaxies
[x] choose most probable value, based on kdeplot
[=] revert to Ngc:
BMag/IMag in GWGC
to VMag in GCSN
: B to V (7,877) + I to V (extra 32) = 7,909/8,946658,910
, over 8,946 galaxies within 30 Mpcmodest
GC distribution
# 2. revert to Ngc
# 2-A. convert BMag/IMag to VMag
LoGal['VMag']=pd.Series(LoGal.Abs_Mag_B + V_max - B_max, index=LoGal.index)
LoGal['e_VMag']=pd.Series(LoGal.err_Abs_Mag_B
- GCpG[idx_GCpG].e_VMag.median()
+ LoGal[idx_LoGal].err_Abs_Mag_B.median())
_ = LoGal.Abs_Mag_B.isnull() & LoGal.Abs_Mag_I.notnull()
LoGal.VMag[_] = LoGal.Abs_Mag_I[_] + V_max - I_max
# 2-B. apply GCSN to get Ngc
LoGal['Ngc'] = pd.Series(map(lambda x: 0 if isnan(f_Sn(x))
else round(f_Sn(x)/10**(0.4*(x+15))), LoGal.VMag),
index=LoGal.index)
LoGal['e_Ngc'] = pd.Series(map(lambda x: LoGal.Ngc[x] if isnan(f_Sn(LoGal.VMag[x]-LoGal.e_VMag[x]))
else abs(LoGal.Ngc[x] - (f_Sn(LoGal.VMag[x]-LoGal.e_VMag[x]) /
10**(0.4*(LoGal.VMag[x]-LoGal.e_VMag[x]+15)))),
LoGal.index), index=LoGal.index)
# 2-C. statistics of result and 5 'Interpolated Galaxy Samples'
# print('Total GC Population in the Local Universe (30 Mpc) is %d' % LoGal.Ngc.sum())
_ = ['Name', 'RA', 'Dec', 'VMag', 'Dist', 'Abs_Mag_B', 'Abs_Mag_I', 'Ngc','e_Ngc', 'iMType']
display(pd.concat([LoGal.loc[:,_].describe(include='all').loc[['count','min','max','mean','std']],
LoGal.loc[:,_].sample(5,random_state=427)]).loc[:,_])
Name | RA | Dec | VMag | Dist | Abs_Mag_B | Abs_Mag_I | Ngc | e_Ngc | iMType | |
---|---|---|---|---|---|---|---|---|---|---|
count | 8946 | 8946.000000 | 8946.000000 | 7909.000000 | 8946.000000 | 7877.000000 | 5606.000000 | 8946.000000 | 8946.000000 | 8946.000000 |
min | NaN | 0.005940 | -89.334590 | -23.478214 | 0.010000 | -22.590000 | -25.520000 | 0.000000 | 0.000000 | 0.000000 |
max | NaN | 23.996680 | 86.131800 | -5.858214 | 29.986000 | -4.970000 | -1.680000 | 11608.000000 | 7187.308776 | 6.000000 |
mean | NaN | 10.973250 | 3.455383 | -17.094336 | 19.011495 | -16.210734 | -17.158776 | 74.085848 | 46.053841 | 0.916387 |
std | NaN | 5.277991 | 34.889882 | 2.452026 | 6.616150 | 2.451796 | 2.627928 | 262.963677 | 190.324592 | 0.936047 |
168135 | SDSSJ142852.74+123454.0 | 14.481330 | 12.581660 | NaN | 28.958000 | NaN | NaN | 0.000000 | 0.000000 | 0.000000 |
168242 | SDSSJ144213.99+333517.4 | 14.703880 | 33.588320 | NaN | 24.528000 | NaN | NaN | 0.000000 | 0.000000 | 0.000000 |
146627 | SDSSJ080937.39+413526.6 | 8.160380 | 41.590710 | -14.168214 | 11.625000 | -13.280000 | -12.870000 | 5.000000 | 2.037453 | 0.000000 |
24031 | UGC06251 | 11.223940 | 53.595040 | -17.718214 | 18.450000 | -16.830000 | NaN | 25.000000 | 7.388049 | 1.000000 |
171592 | SDSSJ151605.06+561614.7 | 15.268080 | 56.270820 | -16.628214 | 9.875000 | -15.740000 | -16.790000 | 14.000000 | 3.364954 | 0.000000 |
[x] bandpass conversion: B/I/K
in GWGC
from White et al. (2011) to V/K
in GCSN model from Harris et al. (2013)
[x] GWGC luminosities
[x] explore correlations
[x] fit conversion constant: B/I in GWGC (197/148) vs V/K in GC SN
on 197 shared galaxies
[x] choose most probable value, based on kdeplot
[x] revert to Ngc: 7,909/8,946 with Ngc
658,910
, over 8,946 galaxies within 30 Mpcmodest
[=] GC distribution
# 2-D. all-sky zoom in view
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111, projection="mollweide")
ax.set_xticklabels(['2h','4h','6h','8h','10h','12h','14h','16h','18h','20h','22h'])
ax.grid(True)
vmin = 0.1
host_GC = LoGal.Ngc > 0
# RA: [0h-24h] -> [-pi,pi]; Dec: [-90°,90°] -> [-pi/2,pi/2]
ax.scatter((LoGal.RA[host_GC]-12)/12*pi, LoGal.Dec[host_GC]/180*pi,
s=5*(2+log2(30./LoGal.Dist[host_GC])),c=LoGal.Ngc[host_GC],alpha=0.5,edgecolors=None,
norm=LogNorm(vmin=vmin,vmax=LoGal.Ngc.max()),cmap=cmap)
ax.scatter((LoGal.RA[-host_GC]-12)/12*pi, LoGal.Dec[-host_GC]/180*pi,
s=2*log2(30./LoGal.Dist[-host_GC]),c='k',alpha=0.5,edgecolors=None)
axins = zoomed_inset_axes(ax, 3.2, loc=5, bbox_to_anchor=(1, 0.5), bbox_transform=ax.figure.transFigure)
axins.axis([-0.35, 0.3, -0.1, 0.4])
cax = axins.scatter((LoGal.RA[host_GC]-12)/12*pi, LoGal.Dec[host_GC]/180*pi,
s=5*(2+log2(30./LoGal.Dist[host_GC])),c=LoGal.Ngc[host_GC],alpha=0.5,edgecolors=None,
norm=LogNorm(vmin=vmin,vmax=LoGal.Ngc.max()),cmap=cmap)
axins.scatter((LoGal.RA[-host_GC]-12)/12*pi,LoGal.Dec[-host_GC]/180*pi,
s=2*log2(30./LoGal.Dist[-host_GC]),c='k',alpha=0.5,edgecolors=None)
axins.set_xticklabels([])
axins.set_yticklabels([])
cbar = fig.colorbar(cax, orientation='horizontal',fraction=0.07,anchor=(0.5,0.0))
ax.add_patch(patches.Rectangle((-0.35,-0.1),0.65,0.5,linewidth=1,edgecolor='k',facecolor='none'))
axins.add_patch(patches.Rectangle((-0.349,-0.096),0.646,0.495,linewidth=1,edgecolor='k',facecolor='none'))
anno_kws = {'width':1, 'headwidth':1, 'headlength':1, 'shrink':0.05}
ax.annotate('',xy=(2.44,0.80),xytext=(-0.48,0.4),arrowprops=anno_kws)
ax.annotate('',xy=(2.50,-.84),xytext=(-0.48,-0.07),arrowprops=anno_kws)
ax.set_title('Total GC Population in the Local Universe (30 Mpc) is %d' % LoGal.Ngc.sum())
if flag_svfg:
savefig(path.join(path_fig, 'all-sky.jpeg'), **params_svfg)
Name | RA | Dec | VMag | Dist | Abs_Mag_B | Abs_Mag_I | Ngc | e_Ngc | iMType | |
---|---|---|---|---|---|---|---|---|---|---|
count | 8946 | 8946.000000 | 8946.000000 | 0.0 | 8946.000000 | 7877.000000 | 5606.000000 | 0.0 | 0.0 | 8946.000000 |
min | NaN | 0.005940 | -89.334590 | NaN | 0.010000 | -22.590000 | -25.520000 | NaN | NaN | 0.000000 |
max | NaN | 23.996680 | 86.131800 | NaN | 29.986000 | -4.970000 | -1.680000 | NaN | NaN | 6.000000 |
mean | NaN | 10.973250 | 3.455383 | NaN | 19.011495 | -16.210734 | -17.158776 | NaN | NaN | 0.916387 |
std | NaN | 5.277991 | 34.889882 | NaN | 6.616150 | 2.451796 | 2.627928 | NaN | NaN | 0.936047 |
168135 | SDSSJ142852.74+123454.0 | 14.481330 | 12.581660 | NaN | 28.958000 | NaN | NaN | NaN | NaN | 0.000000 |
168242 | SDSSJ144213.99+333517.4 | 14.703880 | 33.588320 | NaN | 24.528000 | NaN | NaN | NaN | NaN | 0.000000 |
146627 | SDSSJ080937.39+413526.6 | 8.160380 | 41.590710 | NaN | 11.625000 | -13.280000 | -12.870000 | NaN | NaN | 0.000000 |
24031 | UGC06251 | 11.223940 | 53.595040 | NaN | 18.450000 | -16.830000 | NaN | NaN | NaN | 1.000000 |
171592 | SDSSJ151605.06+561614.7 | 15.268080 | 56.270820 | NaN | 9.875000 | -15.740000 | -16.790000 | NaN | NaN | 0.000000 |
ax = figure(figsize=(16,20)).add_subplot(111)
ax.imshow(imread('Fig_D/mont.png'))
ax.grid(False)
_ = ax.axis('off')
General setup
Variations of parameters
Age spread: the Hubble time is assumed to be 13.5 Gyr
GC simulations on TACC
GClib LIBRARY: TO BE RELEASED, 652,196 -> 662,772
from scipy.stats import gaussian_kde
def generate_rand_from_pdf(pdf, x_grid):
cdf = np.cumsum(pdf)
cdf = cdf / cdf[-1]
random.seed(427)
values = random.rand(int(sum(LoGal.Ngc)))
value_bins = searchsorted(cdf, values)
random_from_cdf = x_grid[value_bins]
return random_from_cdf
fig = figure(figsize=(12,8))
ax = fig.add_subplot(111)
# for kernel size (bw), see http://stats.stackexchange.com/questions/90656/kernel-bandwidth-scotts-vs-silvermans-rules
age_kde = sns.kdeplot(GCage.Age,kernel='gau',bw='silverman',ax=ax,label='MWGC Age KDE')
sns.distplot(GCage.Age,hist=True,norm_hist=True,kde=False,rug=True,label='',ax=ax)
# number of GC needs age
x_grid = linspace(min(GCage.Age)-1, max(GCage.Age)+1, len(age_kde.get_lines()[0].get_data()[1]))
GCage_from_kde = generate_rand_from_pdf(age_kde.get_lines()[0].get_data()[1], x_grid)
# cut off at 13.5 Gyrs
random.seed(427)
GCage_from_kde[GCage_from_kde>=13.5]=random.choice(GCage_from_kde[GCage_from_kde<13.5],sum(GCage_from_kde>=13.5))
random.seed(427)
sns.distplot(GCage_from_kde[randint(0, int(sum(LoGal.Ngc)),int(sum(LoGal.Ngc)))],
hist=False,color='g',ax=ax,label='GC Age $\in$ [%.2f, %.2f]'%(min(GCage_from_kde),max(GCage_from_kde)))
legend(loc=2)
xlim([8,14])
ylim([0,0.7])
hlines(0.5, 8.5, 11.5, 'k', lw=4,alpha=0.2)
hlines(0.5,9,9.5,'r',lw=4,alpha=0.5)
hlines(0.5,9,11.5,'b',lw=4,alpha=0.5)
hlines(0.5,9.5,10.5,'g',lw=4,alpha=0.5)
# vlines(8.5, 0.45, 0.55, 'k', lw=2, alpha=0.2)
vlines(11.5, 0.5, 0.55, 'k', lw=2, alpha=0.2)
text(9.05,0.51,'$T_{eject}$')
vlines(9.5, 0.5, 0.53, 'r',lw=2,alpha=0.5)
text(9.75,0.51,'$T_{merge}$')
vlines(9,0.5,0.55,'r', lw=2,alpha=0.5)
annotate('$Age$',xy=(9,0.54),xytext=(10,0.5475),arrowprops=dict(facecolor='red', shrink=0.05))
annotate('',xy=(11.5,0.54),xytext=(10.3,0.54),arrowprops=dict(facecolor='red', shrink=0.05))
annotate('0 Gyr',xy=(8.5,0.5),xytext=(8.5,0.44),arrowprops=dict(facecolor='gray', shrink=0.01))
annotate('13.5 Gyr',xy=(11.5,0.5),xytext=(11.5,0.44),arrowprops=dict(facecolor='gray', shrink=0.01))
annotate('GC birth',xy=(9,0.5),xytext=(9,0.4),arrowprops=dict(facecolor='blue', shrink=0.05))
annotate('BBH ejected',xy=(9.5,0.5),xytext=(9.5,0.44),arrowprops=dict(facecolor='black', shrink=0.05))
annotate('BBH merged',xy=(10.5,0.5),xytext=(10.5,0.44),arrowprops=dict(facecolor='black', shrink=0.05))
if flag_svfg:
savefig(path.join(path_fig, 'GC_age_fit.jpeg'), **params_svfg)
/opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg. warnings.warn("The 'normed' kwarg is deprecated, and has been "
Assembling BBH database
BBHlib: 87,365 BBHs over the 3,240 GC simulations from GClib
LoGal: 7,909 galaxies with GCs within 30 Mpc
GClib: 662,772 GCs over 7,909 galaxies
=> BBHdata: 17,883,760 BBHs
Assembling BBH database
BBHlib: extracted BBHs from 3,240 GC simulations
LoGal: galaxy within 30 Mpc
GClib: 662,772 GCs over 7,909 Galaxy
assign random GC simulation
GClib model diversity:
=> BBHdata: 17,883,760 BBHs with host galaxy info. + host GC info. + BBH info.
```
for each galaxy in LoGal:
for each GC up to galaxy.Ngc:
assign a random GC simulation from GClib
get GC info. from GClib
get galaxy info. from LoGal
get BBH from BBHlib for the assigned GC simulation
combine
```
BBHdata
host galaxy info. | host GC info. | BBH info. |
---|---|---|
1-5 | 6-7 | 8-14 |
GC, RA, DEC, Dist, VMag | Model, T_GC (13.5Gyr-Age)[Fe/H, f_b, ...] | T_eject, M1, M2, Seperation,Ecc, (BBHlib.Model=GClib.Model) |
if not os.path.isfile(path.join(path_fig,'bbhlib.dat')):
if 'bhlib' not in locals():
## Load BHB LIB
bhlib=pd.DataFrame()
###################################
for i in range(1,325): # model
for j in range(1,11): # model id
###################################
bhe=pd.read_csv('data/BHsystem/%d-%d-bhe.dat' %(i,j),
usecols=[0, 2, 3, 4, 6, 8, 10, 20],
names=['T_eject','Type1','Type2','M1','M2','Seperation','Ecc','Model'],
header=None, delim_whitespace=True)
bhe.Model='%d-%d' %(i,j)
bhlib=pd.concat([bhlib,bhe],ignore_index=False)
# BBH binary
bhsys = bhlib[-(bhlib.Type1==14*(bhlib.Type2==14))].copy(deep=True)
bbhlib = bhlib[bhlib.Type1==14*(bhlib.Type2==14)].copy(deep=True)
bbhlib = bbhlib.drop(bbhlib.columns[[1, 2]], axis=1)
bbhlib.to_csv(path.join(path_fig,'bbhlib.dat'),
index=True, sep=' ', header=True, float_format='%2.6f')
else:
bbhlib = pd.read_csv(path.join(path_fig,'bbhlib.dat'), sep=' ', index_col=0)
display(bbhlib[bbhlib.Model=='187-7'])
display(bbhlib.describe())
T_eject | M1 | M2 | Seperation | Ecc | Model | |
---|---|---|---|---|---|---|
0 | 0.0079 | 26.460 | 11.6190 | 36.1060 | 0.3173 | 187-7 |
1 | 0.0118 | 22.999 | 7.8955 | 222.4900 | 0.0000 | 187-7 |
2 | 0.0157 | 23.190 | 8.1896 | 20.0940 | 0.0978 | 187-7 |
3 | 0.0315 | 23.175 | 8.2358 | 275.1300 | 0.4285 | 187-7 |
4 | 0.0315 | 22.895 | 7.7428 | 91.2530 | 0.8577 | 187-7 |
5 | 0.0315 | 23.874 | 9.4294 | 173.7900 | 0.7756 | 187-7 |
6 | 0.0315 | 25.761 | 10.9290 | 868.3400 | 0.0000 | 187-7 |
7 | 0.0499 | 25.777 | 11.1890 | 793.2600 | 0.3369 | 187-7 |
8 | 0.4144 | 13.386 | 25.7790 | 78.5730 | 0.8045 | 187-7 |
9 | 0.7613 | 26.697 | 26.9370 | 185.3200 | 0.5458 | 187-7 |
10 | 0.7886 | 37.914 | 28.3050 | 331.4400 | 0.3656 | 187-7 |
11 | 0.8329 | 26.815 | 26.1680 | 191.4900 | 0.8724 | 187-7 |
12 | 1.0465 | 23.922 | 24.4750 | 51.2410 | 0.6847 | 187-7 |
13 | 1.2962 | 25.195 | 25.3110 | 217.5200 | 0.5197 | 187-7 |
14 | 1.5873 | 26.058 | 23.4010 | 299.2500 | 0.9600 | 187-7 |
15 | 1.6343 | 26.829 | 26.8670 | 738.9600 | 0.7871 | 187-7 |
16 | 2.1382 | 25.227 | 26.1970 | 203.5100 | 0.6862 | 187-7 |
17 | 2.2051 | 26.931 | 24.2770 | 94.9120 | 0.4887 | 187-7 |
18 | 2.2264 | 26.946 | 24.3750 | 334.5500 | 0.7257 | 187-7 |
19 | 2.2484 | 27.270 | 26.6270 | 317.5200 | 0.7703 | 187-7 |
20 | 2.6205 | 23.248 | 25.5000 | 567.8200 | 0.9503 | 187-7 |
21 | 2.9080 | 21.622 | 24.2440 | 306.3300 | 0.5750 | 187-7 |
22 | 2.9240 | 23.420 | 23.5020 | 623.8000 | 0.3999 | 187-7 |
23 | 2.9433 | 23.193 | 40.5950 | 249.4500 | 0.8743 | 187-7 |
24 | 3.0288 | 23.418 | 22.5560 | 321.2600 | 0.8533 | 187-7 |
25 | 3.1998 | 26.911 | 23.5720 | 706.2900 | 0.6283 | 187-7 |
26 | 3.3023 | 22.814 | 24.3720 | 117.9900 | 0.9995 | 187-7 |
27 | 3.5353 | 22.164 | 20.8490 | 175.1500 | 0.9920 | 187-7 |
28 | 4.0062 | 13.906 | 13.7430 | 9.3148 | 0.8166 | 187-7 |
29 | 4.1225 | 25.177 | 26.7480 | 131.3900 | 0.2675 | 187-7 |
30 | 4.1471 | 25.910 | 52.3300 | 326.6700 | 0.6479 | 187-7 |
31 | 4.2271 | 24.400 | 13.2930 | 12.6890 | 0.6893 | 187-7 |
32 | 4.2399 | 23.670 | 24.8380 | 566.1000 | 0.8899 | 187-7 |
33 | 4.3531 | 20.876 | 28.1520 | 428.8500 | 0.5484 | 187-7 |
34 | 5.4659 | 24.859 | 22.3790 | 292.1900 | 0.9335 | 187-7 |
35 | 5.4829 | 23.489 | 21.2020 | 133.5700 | 0.6346 | 187-7 |
37 | 5.8268 | 24.884 | 49.2470 | 383.3000 | 0.6541 | 187-7 |
38 | 6.0204 | 27.398 | 19.7820 | 188.6200 | 0.4226 | 187-7 |
39 | 6.3426 | 22.117 | 23.3930 | 79.8000 | 0.4630 | 187-7 |
40 | 6.6529 | 27.150 | 11.6090 | 81.0850 | 0.7587 | 187-7 |
41 | 7.0097 | 22.594 | 27.0070 | 120.4400 | 0.4160 | 187-7 |
42 | 7.8008 | 24.080 | 26.8740 | 114.6100 | 0.8300 | 187-7 |
43 | 7.8008 | 24.624 | 25.3410 | 16084.0000 | 0.8807 | 187-7 |
44 | 8.3006 | 13.503 | 18.4360 | 148.1200 | 0.6899 | 187-7 |
45 | 8.3461 | 20.889 | 26.7890 | 189.2700 | 0.4582 | 187-7 |
47 | 9.4491 | 13.858 | 27.8840 | 138.2300 | 0.6108 | 187-7 |
48 | 9.7201 | 22.506 | 22.7170 | 197.7200 | 0.8736 | 187-7 |
49 | 10.4762 | 13.017 | 12.7020 | 175.4100 | 0.8080 | 187-7 |
50 | 10.7644 | 27.078 | 23.0610 | 243.2300 | 0.5025 | 187-7 |
51 | 11.0249 | 11.821 | 13.9440 | 303.7100 | 0.7290 | 187-7 |
53 | 11.7761 | 13.669 | 11.8270 | 233.4800 | 0.8816 | 187-7 |
T_eject | M1 | M2 | Seperation | Ecc | |
---|---|---|---|---|---|
count | 87364.000000 | 87364.000000 | 87364.000000 | 8.736400e+04 | 87364.000000 |
mean | 1.822836 | 24.921859 | 24.182866 | 5.255124e+02 | 0.685132 |
std | 2.452034 | 17.785691 | 19.599724 | 2.390284e+04 | 0.279524 |
min | 0.005200 | 3.000100 | 3.014100 | 0.000000e+00 | 0.000000 |
25% | 0.237200 | 19.756000 | 13.493000 | 4.714150e+01 | 0.518700 |
50% | 0.818000 | 23.190000 | 22.680000 | 9.762050e+01 | 0.765700 |
75% | 2.389600 | 26.121000 | 27.013500 | 2.093825e+02 | 0.917400 |
max | 13.095300 | 2669.500000 | 820.870000 | 5.894200e+06 | 1.000000 |
if not os.path.isfile(path.join(path_fig,'GClib.dat')):
# index of the Galaxy host GCs
idx_host_GC=LoGal.index[LoGal.Ngc>0]
# initialize the GClib
GClib=pd.DataFrame()
# creat GCs in each Galaxy
for row in idx_host_GC:
GClib=GClib.append([LoGal.loc[[row],['RA','Dec','Dist','VMag']]]*int(LoGal.Ngc[row]))
GClib=GClib.reset_index()
GClib=GClib.rename(columns = {'index':'Galaxy'})
# randomly assign model to GClib
random.seed(427)
GClib['Model']=GClib['RA'].apply(lambda x: str(randint(1,325))+'-'+str(randint(1,11)))
# randomly assign Age to GClib
random.seed(427)
GClib['Age']=np.random.choice(GCage_from_kde,size(GCage_from_kde))
# write to file
GClib.to_csv(path.join(path_fig,'GClib.dat'),
index=False,sep=' ', header=False, float_format='%2.6f')
else:
GClib=pd.DataFrame()
GClib=pd.read_csv(path.join(path_fig,'GClib.dat'), sep=' ',header=None,
names=['Galaxy','RA','Dec','Dist','VMag','Model','Age'])
display(GClib.loc[[133], GClib.columns])
display(GClib.describe())
Galaxy | RA | Dec | Dist | VMag | Model | Age | |
---|---|---|---|---|---|---|---|
133 | 133 | 0.05414 | 16.14555 | 13.183 | -20.568214 | 187-7 | 11.070866 |
Galaxy | RA | Dec | Dist | VMag | Age | |
---|---|---|---|---|---|---|
count | 662772.000000 | 662772.000000 | 662772.000000 | 662772.000000 | 662772.000000 | 662772.000000 |
mean | 30603.041664 | 10.959227 | -0.355427 | 21.170577 | -20.534914 | 11.890659 |
std | 27284.737734 | 5.445347 | 38.155540 | 5.926124 | 1.693741 | 0.864954 |
min | 25.000000 | 0.007920 | -89.334590 | 0.010000 | -23.478214 | 8.000000 |
25% | 15238.000000 | 7.451130 | -28.805970 | 17.783000 | -21.698214 | 11.307087 |
50% | 27130.000000 | 11.850490 | -0.480240 | 21.478000 | -20.888214 | 11.968504 |
75% | 35109.000000 | 13.284170 | 25.500910 | 25.351000 | -19.788214 | 12.535433 |
max | 183326.000000 | 23.996680 | 86.131800 | 29.986000 | -11.178214 | 13.480315 |
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
sns.distplot(GClib.Age[GClib.Model.str.contains('187-7')],ax=ax)
ax.set_title('3,240 GC simulations are used to represent %d GCs \n within 30 Mpc, out of which, only %d are duplicated GCs' % (LoGal.Ngc.sum(),sum(GClib.duplicated())))
if flag_svfg:
savefig(path.join(path_fig, 'GC_diversity.jpeg'), **params_svfg)
/opt/conda/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg. warnings.warn("The 'normed' kwarg is deprecated, and has been "
# if not os.path.isfile(path.join(path_fig,'BBHdata.dat.gz')):
# BBHdata=pd.merge(GClib,bbhlib,on='Model')
# BBHdata.to_csv(path.join(path_fig, 'BBHdata.dat.gz'),index=False,sep=' ',
# header=False, float_format='%2.6f',compression='gzip')
# else:
# BBHdata = pd.read_csv(path.join(path_fig,'BBHdata.dat.gz'),sep=' ', header=None,
# names=['PGC','RA','Dec','Dist','VMag','Model','Age',
# 'T_eject','M1','M2','Seperation','Ecc','Period'])
# display(pd.merge(GClib.loc[[133], GClib.columns], bbhlib, on='Model'))
# display(BBHdata.describe())
formulae: Peters (1964)
code: Parallelized from Benacquista (2004)
Separation, Ecc, Period
Merger event rate (6.1.3)
# MC = [28.2, 8.9, 21.1, 7.9, 24.1]
# event = ["GW150914", "GW151226", "GW170104", "GW170608", "GW170814"]
# fig = figure(figsize=fig_size)
# ax = fig.add_subplot(111)
# sns.distplot(insp.MC[insp.MC<50],color='r',ax=ax)
# MC_kde = sns.distplot(merger.MC[merger.MC<50],color='g',ax=ax)
# title('Chirp Mass distribution. \n BBH merged: %d; present BBH: %d; total: %d'
# % (len(merger), len(insp), len(BBHnow)) )
# xlabel('Chirp Mass $M_\odot$')
# legend(['Present-day BBHs','BBH mergers'])
# for i in range(5):
# annotate(event[i],xy=(MC[i],mc2(MC[i])),xytext=(MC[i]-3,mc2(MC[i])+0.01),
# arrowprops=dict(facecolor='yellow', shrink=0.05),
# fontsize=14)
if flag_svfg:
savefig(path.join(path_fig,'MC.jpeg'), **params_svfg)
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'MC.jpeg')))
ax.grid(False)
# fig = figure(figsize=fig_size)
# ax = fig.add_subplot(111, projection="mollweide")
# merger['Tcat'] = pd.cut(merger.T_merge, bins=13, labels=linspace(0.1,0.2,13))
# # RA: [0h-24h] -> [-pi,pi]; Dec: [-90°,90°] -> [-pi/2,pi/2]\n"
# for i in merger.Tcat.unique():
# mask = merger.Tcat == i
# ax.scatter((merger.RA[mask]-12)/12*pi,merger.Dec[mask]/180*pi,alpha=i,c='k',
# s=5*(2+np.sqrt(30./merger.Dist[mask])),cmap=cm.gray)
# ax.scatter((11.37165-12)/12*pi,(59.074409)/180*pi,alpha=0.8, c='r')
# bbh11 = pd.read_csv('../data/0220/inspiral.dat', sep=' ')
# ax.scatter((bbh11.RA-12)/12*pi,bbh11.Dec/180*pi,alpha=0.8,c='g')
# ax.set_xticklabels(['2h','4h','6h','8h','10h','12h','14h','16h','18h','20h','22h'])
# ax.grid(True)
if flag_svfg:
savefig(path.join(path_fig, 'merger_dist.jpeg'), **params_svfg)
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'merger_dist.jpeg')))
ax.grid(False)
# single example bbh (random)
figure(figsize=(12,8))
plot(tim1[0],tim1[1],'g',alpha=0.3)
plot(tim1[0],tim1[2],'r',alpha=0.3)
plot(tim1[0],tim1[3],'b',alpha=0.3)
xlim([0,tim1[0].max()])
legend(['L1 strain','L2 strain','L3 strain'])
xlabel('Time [s]')
if flag_svfg:
savefig(path.join(path_fig,'response.jpeg'), **params_svfg)
# spectrum
fig = figure(figsize=fig_size)
ax = fig.add_subplot(111)
# random dev bhs
ax.loglog(specd[0],np.sqrt(specd[1]**2+specd[2]**2+specd[3]**2), c='y', alpha=0.3)
# LISA
ax.loglog(lisa.f_gw,lisa.strain,label='LISA Standard Configurations',c='k')
# 11 bbh T < 20,000 s
ax.loglog(spec11[0],np.sqrt(spec11[1]**2+spec11[2]**2+spec11[3]**2), c='g', alpha=0.3)
cax = ax.scatter(pop10d.f, pop10d.h0*spec10d[1].mean()/pop10d.h0.mean(), c=pop10d.e,
s=20,alpha=0.3,cmap='viridis')
# ylabel('Strain')
ax.set_xlabel('Frequency Hz')
colorbar(cax)
ax.set_xlim([10**(-6),10**(-1)])
ax.set_ylim([10**(-26),10**(-16)])
# random bbh
ax.scatter(pop1.f, pop1.h0*abs(spec1[1]).mean()/pop1.h0,c='r')
ax.legend(['BH Systems','LISA Standard',
'BBHs with T < 20k s', 'BBHs with T < 10 d'],loc=4)
ax2 = axes([.48, .57, .25, .32])
ax2.loglog(lisa.f_gw,lisa.strain,label='LISA Standard',c='k')
ax2.loglog(spec11[0],np.sqrt(spec11[1]**2+spec11[2]**2+spec11[3]**2), c='g', alpha=0.3)
ax2.loglog(specd[0],np.sqrt(specd[1]**2+specd[2]**2+specd[3]**2), c='y', alpha=0.3)
ax2.set_xlim([10**(-3.22),10**(-3.2)])
ax2.set_ylim([10**(-26),10**(-20)])
ax2.axis('off')
if flag_svfg:
savefig(path.join(path_fig,'spectrum.jpeg'),**params_svfg)
# # adjusted to 30 Mpc
# def freq(MC, T=2):
# '''
# Input
# MC: chip mass in unit of solar mass
# Output
# frequency in unit of mHz
# '''
# return 19.2 * (MC/28)**(-5/8) *(T/3)**(-3/8)
# def SNR(MC, D, T=2):
# '''
# Input
# MC: chip mass in unit of solar mass
# D: distance in unit of Mpc
# Output
# Signal-to-noise ratio
# '''
# f = freq(MC)
# Sn = np.array([lisa.strain[(lisa.f_gw.values - x).argmin()] for x in f])
# return 6.3*(MC/28)**(5/3)*(T/3)**(1/2)*(30/D)*(Sn/(2.3e-41))**(-1/4)*(f/7)**(2/3)
# def dOmega(MC):
# '''
# Input
# MC: chip mass in unit of solar mass
# Output
# Angular positional error in unit of steradian [sr]
# '''
# f = freq(MC)
# return 3.6e4*(SNR(MC,D)/20)**(-2)*(f/7)**(-2)
# def dV1(MC, D):
# '''
# Input
# MC: chip mass in unit of solar mass
# D: distance in unit of Mpc
# Output
# Signal-to-noise ratio, without correction
# '''
# f = freq(MC)
# return 1.852e3*(D/30)**3*(SNR(MC,D)/20)**(-3)*(f/7)**(-2)
# def dV2(MC, D, sigma=10e3, h=0.7):
# '''
# Input
# MC: chip mass in unit of solar mass
# D: distance in unit of Mpc
# sigma: typical velocity dispersion
# h: constant in Hubble parameter for local universe
# Output
# Signal-to-noise ratio, corrected
# '''
# f = freq(MC)
# return 7.778e2*(D/30)**2*(SNR(MC,D)/20)**(-2)*(f/7)**(-2)*(sigma/10e3)*(0.7/h)
# def NdV(bbh):
# return [dV1(bbh.MC, bbh.Dist), dV2(bbh.MC, bbh.Dist), SNR(bbh.MC,bbh.Dist)]
# dv1,dv2,snr = NdV(bbh10d)
# dv = pd.concat([dv1, dv2], axis=1).max(axis=1)
# dv = pd.concat([dv1, dv2], axis=1).max(axis=1)
# mask = (dv<30) & (dv>0.0005)
# lbbh = bbh10d[mask]
# display()
# fig = plt.figure(figsize=fig_size)
# ax=fig.add_subplot(111, projection="mollweide")
# # RA: [0h-24h] -> [-pi,pi]; Dec: [-90°,90°] -> [-pi/2,pi/2]
# ax.scatter((bbh10d.RA-12)/12*pi,bbh10d.Dec/180*pi,c=dv,
# s=5*(2+log2(30./bbh10d.Dist)),alpha=0.3)
# cax = ax.scatter((lbbh.RA-12)/12*pi,lbbh.Dec/180*pi,c=cm.viridis(snr[mask]*10),edgecolors=None,
# s=100*np.sqrt(dv[mask]),alpha=0.2)
# ax.set_xticklabels(['2h','4h','6h','8h','10h','12h','14h','16h','18h','20h','22h'])
# ax.grid(True)
if flag_svfg:
savefig(path.join(path_fig,"localization.jpeg"), **params_svfg)
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'localization.jpeg')))
ax.grid(False)
# # error volume plot
# # BBH statistics
# # sky-locations
# sns.set_style('whitegrid')
# figure(figsize=(12,8))
# ndv = [sum(dv<10**(i/10)) for i in range(50)]
# loglog([10**(i/10) for i in range(50)] , ndv)
# xlabel('Error Volume $\Delta V$ [Mpc$^3$]')
# ylabel('Number of Accurately Localized Binaries')
# xlim([1,27000])
# # location for the zoomed portion
# sub_axes = axes([.62, .2, .25, .25])
# # plot the zoomed portion
# sub_axes.hist(lbbh.Dist)
# sub_axes.set_xlabel('BBH Distance [Mpc]')
# im = plt.imread('Fig_D/localization.jpeg')
# sub_axes2 = axes([.15, .46, .45, .45])
# sub_axes2.imshow(im)
# sub_axes2.axis('off')
if flag_svfg:
savefig(path.join(path_fig, 'numden.jpeg'),**params_svfg)
else:
ax = figure(figsize=sfig_size).add_subplot(111)
ax.imshow(imread(path.join(path_fig, 'numden.jpeg')))
ax.grid(False)
LoGal: galaxy within 30 Mpc
GClib: 662,772 GCs over 7,909 Galaxy
GClib models:
GClib model diversity:
BBHlib: extracted BBHs from 3,240 GC simulations
BBHdata: 17,883,760 BBHs with host galaxy info. + host GC info. + BBH info.
Merger event rate (6.1.3)
Localizationable BBHs: 14
PGC Dist M1 M2 Seperation Ecc
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 12.728000 19.122000 0.157281 0.852028
616 2.188 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 20.632000 137.730000 0.178258 0.477929
1628 0.787 21.424999 25.615999 0.165200 0.750944
1628 0.787 28.433001 26.503000 0.175125 0.813286
1628 0.787 3.158000 10.250000 0.807000 0.000000
131228 0.762 3.158000 10.250000 0.807000 0.000000
1628 0.787 3.158000 10.250000 0.807000 0.000000
1628 0.787 5.319600 10.276000 0.974700 0.000000
1627 0.813 5.319600 10.276000 0.974700 0.000000
contact: contact@domij.info
github repository: LocalBBH
Resume: 3/24/2018 cache