Is there a way to copy the text of an .ods and .pdf file into a LibreOffice .odt file?

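I am trying to create a LibreOffice Basic macro that copies the entire content of a file into a table row. The code below works correctly with text files such as .odt or .txt, but it has problems with .pdf and .ods files. In particular, it crashes on the getText() method. Do you know of any other approach I could use to solve this?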

```basic
REM ***The file Url***
sUrlDoc = "file:///C:/Users/user/Desktop/Test.ods"

REM ***It correctly opens the file***
oDoc = StarDesktop.loadComponentFromURL(sUrlDoc, "_blank", 0, Prop() )

REM ***Correctly inserts a new row in the table***
oTable.Rows.insertByIndex(oTable.getRows().getCount(),1)

REM ***It goes into the right position***
oCell = oTable.getCellByPosition(0,1)

REM ***Should read from file (only works with .odt and .txt)***
oCursor = oDoc.getText(1)
oCell.setString(oCursor.string)

oDoc.close(true)
```

You can get the content of an ODS file in several ways.

The slowest of them is to iterate over all the data in the workbook sheet by sheet and cell by cell, taking the text content of each cell.
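For comparison, the cell-by-cell approach could look roughly like the sketch below. The function name `getContentByCells` is made up for illustration, and it assumes `oDoc` is a spreadsheet document that is already open. Cell texts are joined with tabs and rows with line feeds, which is an arbitrary choice.

```basic
Function getContentByCells(oDoc As Variant) As String
    Dim oSheets As Variant, oSheet As Variant, oCursor As Variant
    Dim aAddr As Variant        ' Address of the used area
    Dim i As Long, nRow As Long, nCol As Long
    Dim sText As String, sResult As String
    oSheets = oDoc.getSheets()
    For i = 0 To oSheets.getCount() - 1
        oSheet = oSheets.getByIndex(i)
        ' Limit the loops to the used area of the sheet
        oCursor = oSheet.createCursor()
        oCursor.gotoStartOfUsedArea(False)
        oCursor.gotoEndOfUsedArea(True)
        aAddr = oCursor.getRangeAddress()
        For nRow = aAddr.StartRow To aAddr.EndRow
            For nCol = aAddr.StartColumn To aAddr.EndColumn
                ' getString() returns the text of the cell as it is displayed
                sText = oSheet.getCellByPosition(nCol, nRow).getString()
                If sText <> "" Then sResult = sResult & sText & Chr(9)
            Next nCol
            sResult = sResult & Chr(10)
        Next nRow
    Next i
    getContentByCells = sResult
End Function
```

On large sheets every `getCellByPosition` call is a separate API round trip, which is why this variant is the slow one.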

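I suggest using the method that Andrew Pitonyak shows in chapter 5.23, "Manipulating the clipboard" (keep this book handy and you will not have to write many macros for everyday tasks; you can simply take ready-made code).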

```basic
Function getContentODS(sDocName As String) As String
    Dim oDoc As Variant                 ' Spreadsheet as object
    Dim bDisposable As Boolean          ' Can be closed
    Dim oSheets As Variant              ' All sheets of oDoc
    Dim oSheet As Variant               ' Single sheet
    Dim i As Long
    Dim oCurrentController As Variant
    Dim oCursor As Variant              ' Get Used Area
    Dim oTransferable As Variant        ' Content of selection
    Dim oTransferDataFlavors As Variant
    Dim oConverter As Variant           ' Util
    Dim j As Integer, iTextLocation As Integer
    Dim oData As Variant
    Dim sResult As String               ' All content as very long string
    GlobalScope.BasicLibraries.loadLibrary("Tools")
    If Not FileExists(sDocName) Then Exit Function
    oDoc = OpenDocument(ConvertToURL(sDocName), Array(), bDisposable)
    sResult = FileNameoutofPath(sDocName) & ": "
    oCurrentController = oDoc.getCurrentController()
    oSheets = oDoc.getSheets()
    oConverter = createUnoService("com.sun.star.script.Converter")
    For i = 0 to oSheets.getCount()-1
        oSheet = oSheets.getByIndex(i)
        oCursor = oSheet.createCursor()
        oCursor.gotoEndOfUsedArea(True)
        oCurrentController.select(oCursor)
        oTransferable = oCurrentController.getTransferable()
        oTransferDataFlavors = oTransferable.getTransferDataFlavors()
        iTextLocation = -1
        For j = LBound(oTransferDataFlavors) To UBound(oTransferDataFlavors)
            If oTransferDataFlavors(j).MimeType = "text/plain;charset=utf-16" Then
                iTextLocation = j
                Exit For
            End If
        Next
        If (iTextLocation >= 0) Then
            oData = oTransferable.getTransferData(oTransferDataFlavors(iTextLocation))
            sResult = sResult & oSheet.getName() & "=" & _
                oConverter.convertToSimpleType(oData, com.sun.star.uno.TypeClass.STRING) & "; "
        End If
    Next i
    If bDisposable Then oDoc.close(True)
    getContentODS = sResult
End Function
```

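This function opens the spreadsheet whose path and name it receives as a parameter, iterates over all the sheets, takes their text content, concatenates it into one long string variable, and finally closes the document.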

You can test this code with the following procedure:

```basic
Sub tst
    MsgBox getContentODS("C:\Users\user\Desktop\Test.ods")
End Sub
```

So the function returns a string to you. Think about how you want to handle this string (or see chapter 7, "Writer Macros").
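As one possible way to handle it, the string can be written into a new row of a Writer table, as the code in the question tries to do. This is only a sketch: the procedure name `appendContentToTable` is made up, and it assumes the macro is run from a Writer document whose first text table is a simple one (no merged or split cells).

```basic
Sub appendContentToTable
    Dim oTable As Variant, oRows As Variant
    Dim nNewRow As Long
    ' Assumption: the current document is a Writer document with at least one table
    oTable = ThisComponent.getTextTables().getByIndex(0)
    oRows = oTable.getRows()
    ' The index of the appended row equals the row count before inserting
    nNewRow = oRows.getCount()
    oRows.insertByIndex(nNewRow, 1)
    ' Put the whole spreadsheet content into the first cell of the new row
    oTable.getCellByPosition(0, nNewRow).setString( _
        getContentODS("C:\Users\user\Desktop\Test.ods"))
End Sub
```

Note that this writes into the row that was just added, whereas the code in the question always writes to row index 1.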

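To get the text part of a PDF document, you can use a similar trick (copy the content from Acrobat Reader to the clipboard and take only the text part of what was copied), or open the file in Draw and iterate over all the graphic elements to collect the text fragments from them.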
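The Draw variant might be sketched as follows. The function name `getContentPDF` is made up. The sketch assumes the PDF import filter (`draw_pdf_import`) is installed and that the PDF contains real text rather than scanned images. Shapes nested inside groups are not visited, and the fragments come out in the order of the shapes on each page, which is not necessarily the reading order.

```basic
Function getContentPDF(sDocName As String) As String
    Dim oDoc As Variant, oPages As Variant, oPage As Variant, oShape As Variant
    Dim aArgs(1) As New com.sun.star.beans.PropertyValue
    Dim i As Long, j As Long
    Dim sResult As String
    If Not FileExists(sDocName) Then Exit Function
    ' Open the PDF hidden, as a Draw document (assumes the PDF import filter is available)
    aArgs(0).Name = "Hidden"
    aArgs(0).Value = True
    aArgs(1).Name = "FilterName"
    aArgs(1).Value = "draw_pdf_import"
    oDoc = StarDesktop.loadComponentFromURL(ConvertToURL(sDocName), "_blank", 0, aArgs())
    If IsNull(oDoc) Then Exit Function
    oPages = oDoc.getDrawPages()
    For i = 0 To oPages.getCount() - 1
        oPage = oPages.getByIndex(i)
        For j = 0 To oPage.getCount() - 1
            oShape = oPage.getByIndex(j)
            ' Only shapes that can carry text are of interest
            If HasUnoInterfaces(oShape, "com.sun.star.text.XTextRange") Then
                If oShape.getString() <> "" Then
                    sResult = sResult & oShape.getString() & Chr(10)
                End If
            End If
        Next j
    Next i
    oDoc.close(True)
    getContentPDF = sResult
End Function
```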