Is there a way to copy the text of an .ods and .pdf file into a LibreOffice .odt file?
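I'm trying to create a LibreOffice Basic macro that copies the entire content of a file into a table row. The code below works correctly with text files, such as .odt or .txt, but has problems with .pdf and .ods files. In particular, it crashes on the getText() method.
Do you know what other method I could use to solve my problem?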
`
REM ***The file Url***
sUrlDoc = "file:///C:/Users/user/Desktop/Test.ods"
REM ***It correctly opens the file***
oDoc = StarDesktop.loadComponentFromURL(sUrlDoc, "_blank", 0, Prop() )
REM ***Correctly inserts a new row in the table***
oTable.Rows.insertByIndex(oTable.getRows().getCount(),1)
REM ***It goes into the right position***
oCell = oTable.getCellByPosition(0,1)
REM ***Should read from file (only works with .odt and .txt)***
oCursor = oDoc.getText(1)
oCell.setString(oCursor.string)
oDoc.close(true)`
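You can get the contents of an ODS file in several ways.
The slowest of them is to iterate over all the data in every sheet of the workbook cell by cell, taking the text content of each cell.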
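For comparison, the cell-by-cell approach could look roughly like the sketch below. The function name `getSheetTextByCells` and the tab/newline separators are my own choices for illustration, not something from the question, and I have not measured how slow it is on large sheets.

```basic
' Hypothetical helper: collects the text of every cell in the used area
' of one sheet, separating columns with tabs and rows with line feeds.
Function getSheetTextByCells(oSheet As Object) As String
    Dim oCursor As Object
    Dim oAddr As Object      ' com.sun.star.table.CellRangeAddress
    Dim nRow As Long, nCol As Long
    Dim sText As String

    ' Shrink a sheet cursor to the used area and read its address
    oCursor = oSheet.createCursor()
    oCursor.gotoStartOfUsedArea(False)
    oCursor.gotoEndOfUsedArea(True)
    oAddr = oCursor.getRangeAddress()

    For nRow = oAddr.StartRow To oAddr.EndRow
        For nCol = oAddr.StartColumn To oAddr.EndColumn
            ' getString() returns the displayed text of the cell
            sText = sText & oSheet.getCellByPosition(nCol, nRow).getString()
            If nCol < oAddr.EndColumn Then sText = sText & Chr(9)
        Next nCol
        sText = sText & Chr(10)
    Next nRow

    getSheetTextByCells = sText
End Function
```

Every cell access here is a separate UNO call, which is why this gets slow as the used area grows.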
I recommend the method Andrew Pitonyak shows in chapter 5.23, Manipulating the clipboard (keep this book handy and you won't have to write many macros for everyday tasks; you can simply take ready-made code).
    Function getContentODS(sDocName As String) As String
        Dim oDoc As Variant                 ' Spreadsheet as object
        Dim bDisposable As Boolean          ' Can be closed
        Dim oSheets As Variant              ' All sheets of oDoc
        Dim oSheet As Variant               ' Single sheet
        Dim i As Long
        Dim oCurrentController As Variant
        Dim oCursor As Variant              ' Get Used Area
        Dim oTransferable As Variant        ' Content of selection
        Dim oTransferDataFlavors As Variant
        Dim oConverter As Variant           ' Util
        Dim j As Integer, iTextLocation As Integer
        Dim oData As Variant
        Dim sResult As String               ' All content as very long string

        GlobalScope.BasicLibraries.loadLibrary("Tools")
        If Not FileExists(sDocName) Then Exit Function
        oDoc = OpenDocument(ConvertToURL(sDocName), Array(), bDisposable)
        sResult = FileNameoutofPath(sDocName) & ": "
        oCurrentController = oDoc.getCurrentController()
        oSheets = oDoc.getSheets()
        oConverter = createUnoService("com.sun.star.script.Converter")
        For i = 0 To oSheets.getCount()-1
            oSheet = oSheets.getByIndex(i)
            oCursor = oSheet.createCursor()
            oCursor.gotoEndOfUsedArea(True)
            oCurrentController.select(oCursor)
            oTransferable = oCurrentController.getTransferable()
            oTransferDataFlavors = oTransferable.getTransferDataFlavors()
            iTextLocation = -1
            For j = LBound(oTransferDataFlavors) To UBound(oTransferDataFlavors)
                If oTransferDataFlavors(j).MimeType = "text/plain;charset=utf-16" Then
                    iTextLocation = j
                    Exit For
                End If
            Next
            If (iTextLocation >= 0) Then
                oData = oTransferable.getTransferData(oTransferDataFlavors(iTextLocation))
                sResult = sResult & oSheet.getName() & "=" & _
                    oConverter.convertToSimpleType(oData, com.sun.star.uno.TypeClass.STRING) & "; "
            End If
        Next i
        If bDisposable Then oDoc.close(True)
        getContentODS = sResult
    End Function
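This function opens the spreadsheet whose path and name it receives as a parameter, iterates over all the sheets, concatenates their text content into one long string variable, and finally closes the document.
You can test this code with the following procedure: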
    Sub tst
        MsgBox getContentODS("C:\Users\user\Desktop\Test.ods")
    End Sub
So the function will return a string to you. Think about how to process this string (or see chapter 7, Writer Macros).
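One possible way to process the string, reusing the table calls from the question, is sketched below. The name `appendTextToTable` is hypothetical, and the sketch assumes `oTable` is an existing Writer text table whose first column should receive the text.

```basic
' Hypothetical helper: appends one row to a Writer text table and puts
' the given text into the first cell of that new row.
Sub appendTextToTable(oTable As Object, sText As String)
    Dim nRow As Long

    ' Index of the row about to be created (rows are zero-based)
    nRow = oTable.getRows().getCount()
    oTable.getRows().insertByIndex(nRow, 1)

    ' Column 0 of the new row
    oTable.getCellByPosition(0, nRow).setString(sText)
End Sub
```

It could then be called from the Writer document, for example with `appendTextToTable(ThisComponent.getTextTables().getByIndex(0), getContentODS("C:\Users\user\Desktop\Test.ods"))`, assuming the document contains at least one table. Unlike the question's code, this writes to the row that was just inserted rather than always to row 1.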
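To get the text part of a PDF document, you can use a similar trick (copy the content from Acrobat Reader to the clipboard and take only the text portion of what was copied), or open it in Draw and iterate over all the graphic elements to extract the text fragments from them.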
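The Draw variant might be sketched as follows. This is only an illustration under a few assumptions: that the PDF import filter is installed so LibreOffice opens the file as a drawing, that the text sits in top-level shapes (text inside grouped shapes is not visited here), and that a line feed is an acceptable separator. The function name `getContentPDF` is made up for this example.

```basic
' Hypothetical helper: opens a PDF hidden, walks every shape on every
' draw page and concatenates the text of shapes that carry text.
Function getContentPDF(sDocName As String) As String
    Dim oDoc As Object, oPages As Object, oPage As Object, oShape As Object
    Dim i As Long, j As Long
    Dim sResult As String
    Dim aArgs(0) As New com.sun.star.beans.PropertyValue

    If Not FileExists(sDocName) Then Exit Function

    ' Load without showing a window
    aArgs(0).Name = "Hidden"
    aArgs(0).Value = True
    oDoc = StarDesktop.loadComponentFromURL(ConvertToURL(sDocName), "_blank", 0, aArgs())
    If IsNull(oDoc) Then Exit Function

    ' Only continue if the file really opened as a drawing
    If HasUnoInterfaces(oDoc, "com.sun.star.drawing.XDrawPagesSupplier") Then
        oPages = oDoc.getDrawPages()
        For i = 0 To oPages.getCount() - 1
            oPage = oPages.getByIndex(i)
            For j = 0 To oPage.getCount() - 1
                oShape = oPage.getByIndex(j)
                ' Skip pictures and other shapes without text
                If HasUnoInterfaces(oShape, "com.sun.star.text.XText") Then
                    sResult = sResult & oShape.getString() & Chr(10)
                End If
            Next j
        Next i
    End If

    oDoc.close(True)
    getContentPDF = sResult
End Function
```

The order of the fragments follows the order of the shapes on each page, which does not necessarily match the reading order of the original PDF.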