Error in running movie recommendations by using Apache Mahout with HDInsight
I ran the code below but received an error...
# The HDInsight cluster name.
$clusterName = "my-cluster-name"
Use-AzureHDInsightCluster $clusterName
# NOTE: The version number portion of the file path
# may change in future versions of HDInsight.
# So dynamically grab it using Hive.
$mahoutPath = Invoke-Hive -Query '!${env:COMSPEC} /c dir /b /s ${env:MAHOUT_HOME}\examples\target\*-job.jar' | where {$_.startswith("C:\apps\dist")}
$mahoutPath = $mahoutPath -replace "\\", "/"
$jarFile = "file:///$mahoutPath"
#
# If you are using an earlier version of HDInsight,
# set $jarFile to the jar file you
# uploaded.
# For example,
# $jarFile = "wasb:///example/jars/mahout-core-0.9-job.jar"
# The arguments for this job
# * input - the path to the data uploaded to HDInsight
# * output - the path to store output data
# * tempDir - the directory for temp files
$jobArguments = "-s", "SIMILARITY_COOCCURRENCE",
"--input", "wasb:///u.data",
"--output", "wasb:///example/out",
"--tempDir", "wasb:///temp/mahout"
# Create the job definition
$jobDefinition = New-AzureHDInsightMapReduceJobDefinition `
-JarFile $jarFile `
-ClassName "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob" `
-Arguments $jobArguments
# Start the job
$job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefinition
# Wait on the job to complete
Write-Host "Wait for the job to complete ..." -ForegroundColor Green
Wait-AzureHDInsightJob -Job $job
# Write out any error information
Write-Host "STDERR"
Get-AzureHDInsightJobOutput -Cluster $clusterName -JobId $job.JobId -StandardError
I have uploaded the u.data file to the root of the container that holds the HDInsight files, using Azure Storage Explorer.
I get the error at this line:
PS C:\> $job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefinition
Error:
Start-AzureHDInsightJob : Request failed with code:InternalServerError
Content:{"error":null}
At line:1 char:8
+ $job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefiniti ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Start-AzureHDInsightJob], HttpLayerException
+ FullyQualifiedErrorId : Microsoft.WindowsAzure.Management.HDInsight.Framework.Core.Library.WebRequest.HttpLayerException,Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.PSCmdlets.StartAzureHDInsightJobCmdlet
Any help is sincerely appreciated.
Thanks
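This looks like the result of a recent change to Hive/Templeton on the HDInsight clusters, which now returns a CRLF at the end of the file path. I changed the script to the following to fix it: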
$mahoutPath = Invoke-Hive -Query '!${env:COMSPEC} /c dir /b /s ${env:MAHOUT_HOME}\examples\target\*-job.jar' | where {$_.startswith("C:\apps\dist")}
$noCRLF = $mahoutPath -replace "`r`n", ""
$cleanedPath = $noCRLF -replace "\\", "/"
$jarFile = "file:///$cleanedPath"
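If you want to check the cleanup without a cluster, here is a minimal sketch of the same idea. It makes a few assumptions: `$rawPath` is a made-up sample standing in for whatever Invoke-Hive returns on your cluster (the real path and version will differ), `ConvertTo-JarUri` is a hypothetical helper name, and the value is treated as a single string. It also swaps in `Trim()` for the explicit CRLF replace, which removes any leading or trailing whitespace rather than only a CRLF.

```powershell
# Hypothetical sample: a jar path with the trailing CRLF that Hive/Templeton now appends.
# The directory and file names here are placeholders, not real HDInsight paths.
$rawPath = "C:\apps\dist\mahout-x.y\examples\target\mahout-examples-x.y-job.jar`r`n"

# Hypothetical helper: turns a raw Windows path into a file:/// URI.
function ConvertTo-JarUri {
    param([string]$RawPath)

    # Trim() strips leading and trailing whitespace, including CR and LF.
    $trimmed = $RawPath.Trim()

    # -replace takes a regular expression, so a literal backslash is written as "\\".
    $forward = $trimmed -replace "\\", "/"

    return "file:///$forward"
}

# Confirm the raw value really ends with CRLF before cleaning it.
$hadCrlf = $rawPath.EndsWith("`r`n")

$jarFile = ConvertTo-JarUri -RawPath $rawPath
Write-Output "Had trailing CRLF: $hadCrlf"
Write-Output $jarFile
```

In the real script you would pass `$mahoutPath` instead of the sample string and hand the resulting `$jarFile` to New-AzureHDInsightMapReduceJobDefinition as before.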