How do I debug a failing cloudera-scm-server process?

I am trying to install Cloudera Manager 5 on CentOS 6, but the cloudera-scm-server process keeps failing and the logs show no clear error.

service --status-all

cloudera-scm-agent (pid  7058) is running...
cloudera-scm-server dead but pid file exists
pg_ctl: server is running (PID: 13650)
/usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data"

cat /var/log/cloudera-scm-server/cloudera-scm-server.out

JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Killed (core dumped)

cat /var/log/cloudera-scm-server/cloudera-scm-server.log

...
2015-06-15 13:54:23,642 INFO main:org.springframework.context.annotation.AnnotationConfigApplicationContext: Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@6424e9d8: startup date [Mon Jun 15 13:54:23 UTC 2015]; root of context hierarchy
2015-06-15 13:54:23,682 INFO main:org.springframework.beans.factory.support.DefaultListableBeanFactory: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@3738baec: defining beans [org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,defaultValidatorConfiguration,messageInterpolator,validServiceDependencyValidator,uniqueServiceTypeValidator,uniqueRoleTypeValidator,existingServiceTypeValidator,existingRoleTypeValidator,expressionValidator,autoConfigSharesValidValidator,sdlParser,mdlParser,parcelParser,alternativesParser,permissionsParser,manifestParser,stringInterpolator,serviceDescriptorValidatorWithoutDependencyCheck,serviceDescriptorValidatorWithDependencyCheck,referenceValidator,serviceMonitoringDefinitionsDescriptorValidator,descriptorVisitor,parcelDescriptorValidator,alternativesDescriptorValidator,permissionsDescriptorValidator,manifestDescriptorValidator,springConstraintValidatorFactory,validatorFactoryBean,metricNameFormatValidator,nameForCrossEntityAggregateFormatValidator,builtInServiceTypes,builtInRoleTypes,builtInNamesForCrossEntityAggregateMetrics,uniqueFieldValidator]; root of factory hierarchy
2015-06-15 13:54:48,589 INFO main:com.cloudera.csd.components.MdlRegistry: Loaded /mdls/cdh5/oozie.mdl
2015-06-15 13:54:48,627 INFO main:com.cloudera.cmf.rules.RulesEngine: Loading rules knowledge base

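The end of the log is not 100% consistent between runs, but this is roughly where it usually fails. On an OutOfMemoryError the application would be killed much like this, but in that case I would expect to find some indication of the error in the log. The heap should also be dumped, yet I can't find a heap dump: there are no *.hprof files anywhere on the machine. The cloudera-scm-server.out log mentions a core dump, but I haven't found that either. Where should I look for it?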

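The server database is the embedded one and is running correctly. The only message in the log that looks suspicious to me is an error saying that the relation 'cm_version' does not exist.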

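The problem was memory-related: the process was running out not of heap space but of actual physical memory. My VM had 512 MB of RAM by default, while the JVM was configured with a 2 GB heap. Once physical memory filled up, the OS silently killed the process, which is why there were no useful log entries. The solution was to increase the VM's memory.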
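A rough way to check for this condition on a Linux host is sketched below. The `heap_fits` helper is hypothetical (it is not part of Cloudera Manager), and the value 2048 is simply the 2 GB heap mentioned above; adjust it to whatever heap size your server is actually started with. The last command looks for traces of the kernel OOM killer, on the assumption that it is what terminated the process.

```shell
#!/bin/sh
# Hypothetical helper: report whether a JVM heap of $1 megabytes
# fits into $2 kilobytes of physical RAM.
heap_fits() {
    heap_mb=${1:-0}
    total_kb=${2:-0}
    total_mb=$((total_kb / 1024))
    if [ "$heap_mb" -gt "$total_mb" ]; then
        echo "heap ${heap_mb} MB exceeds physical memory ${total_mb} MB"
        return 1
    fi
    echo "heap ${heap_mb} MB fits in physical memory ${total_mb} MB"
    return 0
}

# Physical memory as reported by the kernel (Linux only).
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)

# 2048 is the 2 GB heap from the answer above; change it to match your setup.
heap_fits 2048 "$mem_kb"

# Search the kernel log for OOM killer messages; this may require root.
dmesg 2>/dev/null | grep -i -E 'out of memory|killed process' \
    || echo "no OOM killer messages found (or kernel log not readable)"
```

If the heap is larger than physical memory, or the kernel log shows that a java process was killed, either give the VM more RAM or lower the configured heap size.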