运行 当脚本本身正在导入许多其他模块时,分析器下的脚本,其中一个模块由于已知功能而非常慢
Running a script under a profiler when the script itself is importing many other modules one of which is very slow due to a known function
这是我运行宁
模拟的布局
----main directory
-----output (directory)
-----halo (directory)
-----my_script.py
-----settings_centroid.py
-----simulation (directory)
-----halo_dark (directory)
-----halo_analysis (directory)
-----gizmo (directory)
-----gizmo_plot.py
.
.
.
我的my_script.py
(主目录下运行)是:
.
.
.
from simulation import gizmo
import settings_centroid
settings_centroid.init()
.
.
.
os.system('> output/{}/Info/{}/{}/redshift_{:.3f}/all_subhalo_properties_gas.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.rotation_status, settings_centroid.redshift_z))
.
.
.
gizmo.plot.Image.plot_image(...)
我的 settings_centroid.py
脚本是:
.
.
.
def init():
global ....
.
.
.
我的 gizmo_plot.py
是:
.
.
.
class ImageClass(ut.io.SayClass):
def plot_image():
dimen_label = {0: 'x', 1: 'y', 2: 'z'}
if dimensions_select is None or not len(dimensions_select):
dimensions_select = dimensions_plot
if np.isscalar(distances_max):
distances_max = [distances_max for dimen_i in
range(part[species_name]['position'].shape[1])]
distances_max = np.array(distances_max, dtype=np.float64)
position_limits = []
for dimen_i in range(distances_max.shape[0]):
position_limits.append([-distances_max[dimen_i], distances_max[dimen_i]])
position_limits = np.array(position_limits)
if part_indices is None or not len(part_indices):
part_indices = ut.array.get_arange(part[species_name]['position'].shape[0])
if property_select:
part_indices = ut.catalog.get_indices_catalog(
part[species_name], property_select, part_indices)
if subsample_factor is not None and subsample_factor > 1:
part_indices = part_indices[::subsample_factor]
positions = np.array(part[species_name]['position'][part_indices])
mass_array = np.array(part[species_name]['mass'][part_indices])
velocity_array = np.array(part[species_name]['velocity'][part_indices])
if species_name == 'gas':
HI_fraction_array = np.array(part[species_name]['hydrogen.neutral.fraction'])
weights = None
if weight_name:
weights = part[species_name].prop(weight_name, part_indices)
center_position = ut.particle.parse_property(part, 'center_position', center_position)
if center_position is not None and len(center_position):
# re-orient to input center
positions -= center_position
positions *= part.snapshot['scalefactor']
if rotation is not None:
# rotate image
if rotation is True:
# rotate according to principal axes
if (len(part[species_name].host_rotation_tensors) and
len(part[species_name].host_rotation_tensors[0])):
# rotate to align with stored principal axes
rotation_tensor = part[species_name].host_rotation_tensors[0]
else:
# compute principal axes using all particles originally within image limits
masks = (positions[:, dimensions_select[0]] <= distances_max[0])
for dimen_i in dimensions_select:
masks *= (
(positions[:, dimen_i] >= -distances_max[dimen_i]) *
(positions[:, dimen_i] <= distances_max[dimen_i])
)
rotation_tensor = ut.coordinate.get_principal_axes(
positions[masks], weights[masks])[0]
elif len(rotation):
# use input rotation vectors
rotation_tensor = np.asarray(rotation)
if (np.ndim(rotation_tensor) != 2 or
rotation_tensor.shape[0] != positions.shape[1] or
rotation_tensor.shape[1] != positions.shape[1]):
raise ValueError('wrong shape for rotation = {}'.format(rotation))
else:
raise ValueError('cannot parse rotation = {}'.format(rotation))
positions = ut.coordinate.get_coordinates_rotated(positions, rotation_tensor)
# keep only particles within distance limits and with speeds less than 500km/sec compared to the parent halo.
masks1 = (positions[:, dimensions_select[0]] <= distances_max[0]) #part[species_name]['position'][part_indices]
for dimen_i in dimensions_select:
masks2 = (np.abs(part[species_name]['velocity'][:, dimen_i] - settings_centroid.HCV[dimen_i]) < 500)
masks_part = masks1 * masks2
masks_part *= (
(positions[:, dimen_i] >= -distances_max[dimen_i]) *
(positions[:, dimen_i] <= distances_max[dimen_i])
)
positions = positions[masks_part]
mass_array = mass_array[masks_part]
velocity_array = velocity_array[masks_part]
if species_name == 'gas':
HI_fraction_array = HI_fraction_array[masks_part]
else:
HI_fraction_array = None
if weights is not None:
weights = weights[masks_part]
else:
raise ValueError('need to input center position')
if distance_bin_width is not None and distance_bin_width > 0:
position_bin_number = int(
np.round(2 * np.max(distances_max[dimensions_plot]) / distance_bin_width))
elif distance_bin_number is not None and distance_bin_number > 0:
position_bin_number = 2 * distance_bin_number
else:
raise ValueError('need to input either distance bin width or bin number')
#radiuss_array, positions_array, masss_array = [], [], []
if hal is not None:
# compile halos
if hal_indices is None or not len(hal_indices):
hal_indices = ut.array.get_arange(hal['mass.200m'])
if 0 not in hal_indices:
hal_indices = np.concatenate([[0], hal_indices])
hal_positions = np.array(hal[hal_position_kind][hal_indices])
if center_position is not None and len(center_position):
hal_positions -= center_position
hal_positions *= hal.snapshot['scalefactor']
hal_radiuss = hal[hal_radius_kind][hal_indices]
hal_masss = hal['mass.200m'][hal_indices]
hal_ids = hal['id'][hal_indices]
hal_distances = np.linalg.norm(hal['host.distance'], axis=1)[hal_indices]
hal_star_masss = hal['star.mass'][hal_indices]
hal_star_sizes = hal['star.radius.90'][hal_indices]
# initialize masks
masks = (hal_positions[:, dimensions_select[0]] <= distances_max[0])
for dimen_i in dimensions_select:
masks *= (
(hal_positions[:, dimen_i] >= -distances_max[dimen_i]) *
(hal_positions[:, dimen_i] <= distances_max[dimen_i])
)
hal_radiuss = hal_radiuss[masks]
hal_positions = hal_positions[masks]
hal_masss = hal_masss[masks]
hal_ids = hal_ids[masks]
hal_distances = hal_distances[masks]
hal_star_masss = hal_star_masss[masks]
hal_star_sizes = hal_star_sizes[masks]
halo_ids = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_ids.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_masses = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_radii = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_radii.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_positions = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_positions.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_distances = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_distances.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_star_masss = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_star_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_star_sizes = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_star_sizes.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
np.savetxt(halo_ids, hal_ids, fmt='%.0f')
np.savetxt(halo_masses, hal_masss, fmt='%.3e')
np.savetxt(halo_radii, hal_radiuss, fmt='%.3e')
np.savetxt(halo_positions, hal_positions, fmt='%.3e')
np.savetxt(halo_distances, hal_distances, fmt='%.3e')
np.savetxt(halo_star_masss, hal_star_masss, fmt='%.3e')
np.savetxt(halo_star_sizes, hal_star_sizes, fmt='%.3e')
def get_histogram(...):
if '3d' in image_kind:
# calculate maximum local density along projected dimension
hist_valuess, (hist_xs, hist_ys, hist_zs) = np.histogramdd(positions, position_bin_number, position_limits, weights=weights, normed=False,)
# convert to 3-d density
hist_valuess /= (np.diff(hist_xs)[0] * np.diff(hist_ys)[0] * np.diff(hist_zs)[0])
else:
# project along single dimension
hist_valuess, hist_xs, hist_ys = np.histogram2d(positions[:, dimensions_plot[0]], positions[:, dimensions_plot[1]], position_bin_number, position_limits[dimensions_plot], weights=weights, normed=False,)
# convert to surface density
hist_valuess /= np.diff(hist_xs)[0] * np.diff(hist_ys)[0]
# convert to number density
if use_column_units:
hist_valuess *= ut.basic.constant.hydrogen_per_sun * ut.basic.constant.kpc_per_cm ** 2
lls_number = np.sum((hist_valuess > 1e17) * (hist_valuess < 2e20))
dla_number = np.sum(hist_valuess > 2e20)
LLS, DLA = lls_number, dla_number
self.say('Number of grids: LLS = {:.0f}, \t DLA = {:.0f}'.format(lls_number, dla_number))
# Counting absorber grid number in each subhalo
if return_halo_info:
subhalos_gas = 'output/{}/Info/{}/{}/redshift_{:.3f}/all_subhalo_properties_gas.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.rotation_status, settings_centroid.redshift_z)
hal_positions_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_positions.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_radiuss_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_radii.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_masss_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_ids_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_ids.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_distances_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_distances.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
totals_gas = {}
sub_circle_catalog_gas = []
enclosing_circles_gas = {}
for hal_id, hal_position, hal_radius, hal_mass, hal_distance in zip(hal_ids_data, hal_positions_data, hal_radiuss_data, hal_masss_data, hal_distances_data):
if ((hal_distance <= settings_centroid.distance_max) and (log10(hal_mass) >= settings_centroid.low_mass_cutoff)):
hal_gas_mass = sum(settings_centroid.part_HI_mass[i]*settings_centroid.part_HI_fraction[i] for i in np.where(settings_centroid.part_HI_fraction > 0)[0] if (np.linalg.norm(settings_centroid.part_HI_position[i] - hal_position) * settings_centroid.scale_factor <= hal_radius))
if (hal_gas_mass/hal_mass > 1.0e-8):
enclosing_circles_gas[hal_id] = float(settings_centroid.trunc_digits(log10(hal_mass), 4))
# choose all subhalos' IDs enclosing the DLA pixel
enclosing_circles = list(enclosing_circles_gas.keys())
sub_circle_catalog_gas += [(enclosing_circles_gas[i], 1) for i in enclosing_circles]
# add up all special grids in each sub-circle when looping over all grids
for key, value in sub_circle_catalog_gas:
totals_gas[key] = totals_gas.get(key, 0) + value
totals_gas = collections.OrderedDict(sorted(totals_gas.items()))
totals_gas = list(totals_gas.items())
with open(subhalos_gas, "a") as smallest_local_subhalos:
print('{}'.format(totals_gas), file=smallest_local_subhalos)
smallest_local_subhalos.close()
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_positions.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_radii.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_ids.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_distances.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
masks = (hist_valuess > 0)
self.say('histogram min, med, max = {:.3e}, {:.3e}, {:.3e}'.format(hist_valuess[masks].min(), np.median(hist_valuess[masks]), hist_valuess[masks].max()))
hist_limits = np.array([hist_valuess[masks].min(), hist_valuess[masks].max()])
return hist_valuess, hist_xs, hist_ys, hist_limits
.
.
.
似乎在 运行宁 my_script.py
的同时,要花很长时间才能产生结果。通过反复试验,似乎缓慢的部分发生在 if...if 嵌套循环内 gizmo_plot.py
模块内名为 get_histogram()
的函数下。但是,我需要在分析器下 运行 my_script.py
才能准确找到慢行。你能帮我如何通过在不同的文件中输出配置文件来完成吗?特别是如何准确定位函数的慢行?
下面 Wilx 建议的程序的输出是:
my_script.prof% sort cumulative
my_script.prof% stats 5
Mon Nov 4 14:32:25 2019 my_script.prof
76741270081 function calls (76741240862 primitive calls) in 107707.564 seconds
Ordered by: cumulative time
List reduced from 4432 to 5 due to restriction <5>
ncalls tottime percall cumtime percall filename:lineno(function)
814/1 0.373 0.000 107707.595 107707.595 {built-in method builtins.exec}
1 0.348 0.348 107707.331 107707.331 my_script.py:1(<module>)
2 312.439 156.219 105178.281 52589.140 gizmo_plot.py:220(plot_image)
2 5.966 2.983 104209.307 52104.654 gizmo_plot.py:658(get_histogram)
69 0.001 0.000 104168.283 1509.685 {built-in method builtins.sum}
第二次检查:
my_script.prof% sort time
my_script.prof% stats 10
Mon Nov 4 14:32:25 2019 my_script.prof
76741270081 function calls (76741240862 primitive calls) in 107707.564 seconds
Ordered by: internal time
List reduced from 4432 to 10 due to restriction <10>
ncalls tottime percall cumtime percall filename:lineno(function)
9592513999 41813.829 0.000 80076.700 0.000 linalg.py:2203(norm)
69 24091.619 349.154 104168.282 1509.685 gizmo_plot.py:726(<genexpr>)
9592514405 9783.770 0.000 9783.770 0.000 {built-in method numpy.core.multiarray.dot}
9592514634 8158.522 0.000 11181.488 0.000 numeric.py:433(asarray)
9592514062 7065.503 0.000 7065.503 0.000 {method 'ravel' of 'numpy.ndarray' objects}
9592513998 5393.512 0.000 7708.295 0.000 linalg.py:113(isComplexType)
19185030173/19185030020 4839.445 0.000 4839.831 0.000 {built-in method builtins.issubclass}
9592517050 3069.880 0.000 3072.494 0.000 {built-in method numpy.core.multiarray.array}
11/3 605.531 55.048 605.532 201.844 gizmo_io.py:190(prop)
376 530.257 1.410 530.655 1.411 dataset.py:634(read_direct)
运行 分析器喜欢 python3 -m cProfile -o my_script.prof my_script.py
。这应该会在您的脚本完成后创建一个 my_script.prof
。然后,您可以使用 python3 -m pstats my_script.prof
.
加载该 .prof
文件
这是我运行宁
模拟的布局----main directory
-----output (directory)
-----halo (directory)
-----my_script.py
-----settings_centroid.py
-----simulation (directory)
-----halo_dark (directory)
-----halo_analysis (directory)
-----gizmo (directory)
-----gizmo_plot.py
.
.
.
我的my_script.py
(主目录下运行)是:
.
.
.
from simulation import gizmo
import settings_centroid
settings_centroid.init()
.
.
.
os.system('> output/{}/Info/{}/{}/redshift_{:.3f}/all_subhalo_properties_gas.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.rotation_status, settings_centroid.redshift_z))
.
.
.
gizmo.plot.Image.plot_image(...)
我的 settings_centroid.py
脚本是:
.
.
.
def init():
global ....
.
.
.
我的 gizmo_plot.py
是:
.
.
.
class ImageClass(ut.io.SayClass):
def plot_image():
dimen_label = {0: 'x', 1: 'y', 2: 'z'}
if dimensions_select is None or not len(dimensions_select):
dimensions_select = dimensions_plot
if np.isscalar(distances_max):
distances_max = [distances_max for dimen_i in
range(part[species_name]['position'].shape[1])]
distances_max = np.array(distances_max, dtype=np.float64)
position_limits = []
for dimen_i in range(distances_max.shape[0]):
position_limits.append([-distances_max[dimen_i], distances_max[dimen_i]])
position_limits = np.array(position_limits)
if part_indices is None or not len(part_indices):
part_indices = ut.array.get_arange(part[species_name]['position'].shape[0])
if property_select:
part_indices = ut.catalog.get_indices_catalog(
part[species_name], property_select, part_indices)
if subsample_factor is not None and subsample_factor > 1:
part_indices = part_indices[::subsample_factor]
positions = np.array(part[species_name]['position'][part_indices])
mass_array = np.array(part[species_name]['mass'][part_indices])
velocity_array = np.array(part[species_name]['velocity'][part_indices])
if species_name == 'gas':
HI_fraction_array = np.array(part[species_name]['hydrogen.neutral.fraction'])
weights = None
if weight_name:
weights = part[species_name].prop(weight_name, part_indices)
center_position = ut.particle.parse_property(part, 'center_position', center_position)
if center_position is not None and len(center_position):
# re-orient to input center
positions -= center_position
positions *= part.snapshot['scalefactor']
if rotation is not None:
# rotate image
if rotation is True:
# rotate according to principal axes
if (len(part[species_name].host_rotation_tensors) and
len(part[species_name].host_rotation_tensors[0])):
# rotate to align with stored principal axes
rotation_tensor = part[species_name].host_rotation_tensors[0]
else:
# compute principal axes using all particles originally within image limits
masks = (positions[:, dimensions_select[0]] <= distances_max[0])
for dimen_i in dimensions_select:
masks *= (
(positions[:, dimen_i] >= -distances_max[dimen_i]) *
(positions[:, dimen_i] <= distances_max[dimen_i])
)
rotation_tensor = ut.coordinate.get_principal_axes(
positions[masks], weights[masks])[0]
elif len(rotation):
# use input rotation vectors
rotation_tensor = np.asarray(rotation)
if (np.ndim(rotation_tensor) != 2 or
rotation_tensor.shape[0] != positions.shape[1] or
rotation_tensor.shape[1] != positions.shape[1]):
raise ValueError('wrong shape for rotation = {}'.format(rotation))
else:
raise ValueError('cannot parse rotation = {}'.format(rotation))
positions = ut.coordinate.get_coordinates_rotated(positions, rotation_tensor)
# keep only particles within distance limits and with speeds less than 500km/sec compared to the parent halo.
masks1 = (positions[:, dimensions_select[0]] <= distances_max[0]) #part[species_name]['position'][part_indices]
for dimen_i in dimensions_select:
masks2 = (np.abs(part[species_name]['velocity'][:, dimen_i] - settings_centroid.HCV[dimen_i]) < 500)
masks_part = masks1 * masks2
masks_part *= (
(positions[:, dimen_i] >= -distances_max[dimen_i]) *
(positions[:, dimen_i] <= distances_max[dimen_i])
)
positions = positions[masks_part]
mass_array = mass_array[masks_part]
velocity_array = velocity_array[masks_part]
if species_name == 'gas':
HI_fraction_array = HI_fraction_array[masks_part]
else:
HI_fraction_array = None
if weights is not None:
weights = weights[masks_part]
else:
raise ValueError('need to input center position')
if distance_bin_width is not None and distance_bin_width > 0:
position_bin_number = int(
np.round(2 * np.max(distances_max[dimensions_plot]) / distance_bin_width))
elif distance_bin_number is not None and distance_bin_number > 0:
position_bin_number = 2 * distance_bin_number
else:
raise ValueError('need to input either distance bin width or bin number')
#radiuss_array, positions_array, masss_array = [], [], []
if hal is not None:
# compile halos
if hal_indices is None or not len(hal_indices):
hal_indices = ut.array.get_arange(hal['mass.200m'])
if 0 not in hal_indices:
hal_indices = np.concatenate([[0], hal_indices])
hal_positions = np.array(hal[hal_position_kind][hal_indices])
if center_position is not None and len(center_position):
hal_positions -= center_position
hal_positions *= hal.snapshot['scalefactor']
hal_radiuss = hal[hal_radius_kind][hal_indices]
hal_masss = hal['mass.200m'][hal_indices]
hal_ids = hal['id'][hal_indices]
hal_distances = np.linalg.norm(hal['host.distance'], axis=1)[hal_indices]
hal_star_masss = hal['star.mass'][hal_indices]
hal_star_sizes = hal['star.radius.90'][hal_indices]
# initialize masks
masks = (hal_positions[:, dimensions_select[0]] <= distances_max[0])
for dimen_i in dimensions_select:
masks *= (
(hal_positions[:, dimen_i] >= -distances_max[dimen_i]) *
(hal_positions[:, dimen_i] <= distances_max[dimen_i])
)
hal_radiuss = hal_radiuss[masks]
hal_positions = hal_positions[masks]
hal_masss = hal_masss[masks]
hal_ids = hal_ids[masks]
hal_distances = hal_distances[masks]
hal_star_masss = hal_star_masss[masks]
hal_star_sizes = hal_star_sizes[masks]
halo_ids = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_ids.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_masses = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_radii = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_radii.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_positions = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_positions.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_distances = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_distances.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_star_masss = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_star_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
halo_star_sizes = 'output/{}/Info/{}/halo_catalog_{:.3f}/halo_star_sizes.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z)
np.savetxt(halo_ids, hal_ids, fmt='%.0f')
np.savetxt(halo_masses, hal_masss, fmt='%.3e')
np.savetxt(halo_radii, hal_radiuss, fmt='%.3e')
np.savetxt(halo_positions, hal_positions, fmt='%.3e')
np.savetxt(halo_distances, hal_distances, fmt='%.3e')
np.savetxt(halo_star_masss, hal_star_masss, fmt='%.3e')
np.savetxt(halo_star_sizes, hal_star_sizes, fmt='%.3e')
def get_histogram(...):
if '3d' in image_kind:
# calculate maximum local density along projected dimension
hist_valuess, (hist_xs, hist_ys, hist_zs) = np.histogramdd(positions, position_bin_number, position_limits, weights=weights, normed=False,)
# convert to 3-d density
hist_valuess /= (np.diff(hist_xs)[0] * np.diff(hist_ys)[0] * np.diff(hist_zs)[0])
else:
# project along single dimension
hist_valuess, hist_xs, hist_ys = np.histogram2d(positions[:, dimensions_plot[0]], positions[:, dimensions_plot[1]], position_bin_number, position_limits[dimensions_plot], weights=weights, normed=False,)
# convert to surface density
hist_valuess /= np.diff(hist_xs)[0] * np.diff(hist_ys)[0]
# convert to number density
if use_column_units:
hist_valuess *= ut.basic.constant.hydrogen_per_sun * ut.basic.constant.kpc_per_cm ** 2
lls_number = np.sum((hist_valuess > 1e17) * (hist_valuess < 2e20))
dla_number = np.sum(hist_valuess > 2e20)
LLS, DLA = lls_number, dla_number
self.say('Number of grids: LLS = {:.0f}, \t DLA = {:.0f}'.format(lls_number, dla_number))
# Counting absorber grid number in each subhalo
if return_halo_info:
subhalos_gas = 'output/{}/Info/{}/{}/redshift_{:.3f}/all_subhalo_properties_gas.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.rotation_status, settings_centroid.redshift_z)
hal_positions_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_positions.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_radiuss_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_radii.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_masss_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_ids_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_ids.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
hal_distances_data = np.loadtxt(r'output/{}/Info/{}/halo_catalog_{:.3f}/halo_distances.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
totals_gas = {}
sub_circle_catalog_gas = []
enclosing_circles_gas = {}
for hal_id, hal_position, hal_radius, hal_mass, hal_distance in zip(hal_ids_data, hal_positions_data, hal_radiuss_data, hal_masss_data, hal_distances_data):
if ((hal_distance <= settings_centroid.distance_max) and (log10(hal_mass) >= settings_centroid.low_mass_cutoff)):
hal_gas_mass = sum(settings_centroid.part_HI_mass[i]*settings_centroid.part_HI_fraction[i] for i in np.where(settings_centroid.part_HI_fraction > 0)[0] if (np.linalg.norm(settings_centroid.part_HI_position[i] - hal_position) * settings_centroid.scale_factor <= hal_radius))
if (hal_gas_mass/hal_mass > 1.0e-8):
enclosing_circles_gas[hal_id] = float(settings_centroid.trunc_digits(log10(hal_mass), 4))
# choose all subhalos' IDs enclosing the DLA pixel
enclosing_circles = list(enclosing_circles_gas.keys())
sub_circle_catalog_gas += [(enclosing_circles_gas[i], 1) for i in enclosing_circles]
# add up all special grids in each sub-circle when looping over all grids
for key, value in sub_circle_catalog_gas:
totals_gas[key] = totals_gas.get(key, 0) + value
totals_gas = collections.OrderedDict(sorted(totals_gas.items()))
totals_gas = list(totals_gas.items())
with open(subhalos_gas, "a") as smallest_local_subhalos:
print('{}'.format(totals_gas), file=smallest_local_subhalos)
smallest_local_subhalos.close()
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_positions.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_radii.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_masses.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_ids.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
os.system('> output/{}/Info/{}/halo_catalog_{:.3f}/halo_distances.txt'.format(settings_centroid.halo_size, settings_centroid.halo_name, settings_centroid.redshift_z))
masks = (hist_valuess > 0)
self.say('histogram min, med, max = {:.3e}, {:.3e}, {:.3e}'.format(hist_valuess[masks].min(), np.median(hist_valuess[masks]), hist_valuess[masks].max()))
hist_limits = np.array([hist_valuess[masks].min(), hist_valuess[masks].max()])
return hist_valuess, hist_xs, hist_ys, hist_limits
.
.
.
似乎在 运行宁 my_script.py
的同时,要花很长时间才能产生结果。通过反复试验,似乎缓慢的部分发生在 if...if 嵌套循环内 gizmo_plot.py
模块内名为 get_histogram()
的函数下。但是,我需要在分析器下 运行 my_script.py
才能准确找到慢行。你能帮我如何通过在不同的文件中输出配置文件来完成吗?特别是如何准确定位函数的慢行?
下面 Wilx 建议的程序的输出是:
my_script.prof% sort cumulative
my_script.prof% stats 5
Mon Nov 4 14:32:25 2019 my_script.prof
76741270081 function calls (76741240862 primitive calls) in 107707.564 seconds
Ordered by: cumulative time
List reduced from 4432 to 5 due to restriction <5>
ncalls tottime percall cumtime percall filename:lineno(function)
814/1 0.373 0.000 107707.595 107707.595 {built-in method builtins.exec}
1 0.348 0.348 107707.331 107707.331 my_script.py:1(<module>)
2 312.439 156.219 105178.281 52589.140 gizmo_plot.py:220(plot_image)
2 5.966 2.983 104209.307 52104.654 gizmo_plot.py:658(get_histogram)
69 0.001 0.000 104168.283 1509.685 {built-in method builtins.sum}
第二次检查:
my_script.prof% sort time
my_script.prof% stats 10
Mon Nov 4 14:32:25 2019 my_script.prof
76741270081 function calls (76741240862 primitive calls) in 107707.564 seconds
Ordered by: internal time
List reduced from 4432 to 10 due to restriction <10>
ncalls tottime percall cumtime percall filename:lineno(function)
9592513999 41813.829 0.000 80076.700 0.000 linalg.py:2203(norm)
69 24091.619 349.154 104168.282 1509.685 gizmo_plot.py:726(<genexpr>)
9592514405 9783.770 0.000 9783.770 0.000 {built-in method numpy.core.multiarray.dot}
9592514634 8158.522 0.000 11181.488 0.000 numeric.py:433(asarray)
9592514062 7065.503 0.000 7065.503 0.000 {method 'ravel' of 'numpy.ndarray' objects}
9592513998 5393.512 0.000 7708.295 0.000 linalg.py:113(isComplexType)
19185030173/19185030020 4839.445 0.000 4839.831 0.000 {built-in method builtins.issubclass}
9592517050 3069.880 0.000 3072.494 0.000 {built-in method numpy.core.multiarray.array}
11/3 605.531 55.048 605.532 201.844 gizmo_io.py:190(prop)
376 530.257 1.410 530.655 1.411 dataset.py:634(read_direct)
运行 分析器喜欢 python3 -m cProfile -o my_script.prof my_script.py
。这应该会在您的脚本完成后创建一个 my_script.prof
。然后,您可以使用 python3 -m pstats my_script.prof
.
.prof
文件