Read embedded inline image from an email as HTML

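I am trying to read emails over an IMAP connection, and I receive the email content as HTML. When an email has an image embedded in its body, I cannot get that image from the email body.

The HTML output is shown below.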

<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
    {font-family:Cambria;
    panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
    {font-family:Calibri;
    panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
    {font-family:Tahoma;
    panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0in;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
    {mso-style-priority:99;
    color:blue;
    text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
    {mso-style-priority:99;
    color:purple;
    text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
    {mso-style-priority:99;
    mso-style-link:"Balloon Text Char";
    margin:0in;
    margin-bottom:.0001pt;
    font-size:8.0pt;
    font-family:"Tahoma","sans-serif";}
span.EmailStyle17
    {mso-style-type:personal-compose;
    font-family:"Calibri","sans-serif";
    color:windowtext;}
span.BalloonTextChar
    {mso-style-name:"Balloon Text Char";
    mso-style-priority:99;
    mso-style-link:"Balloon Text";
    font-family:"Tahoma","sans-serif";}
.MsoChpDefault
    {mso-style-type:export-only;
    font-family:"Calibri","sans-serif";}
@page WordSection1
    {size:8.5in 11.0in;
    margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
    {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">The body parts <o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><i><span lang="EN-GB" style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#365F91">Regards</span></i><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#365F91">,</span></i><span lang="EN-IN" style="color:#365F91"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-GB" style="color:#365F91">&nbsp;</span><span lang="EN-IN" style="color:#365F91"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Amith K&nbsp; Bharathan</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Software Engineer</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">&nbsp;</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>

<p class="MsoNormal">
<span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">


**<img width="110" height="61" id="Picture_x0020_1" src="cid:image001.png@01D190D9.38FE7C00" alt="Description: Description: Description: Description: tstlogo">**


</span><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">TTT Software &amp; Systems India Pvt. Ltd.</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D"> Infopark, Kakkanad-682 030</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Mob&nbsp;&nbsp; :</span></i></b><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D"> &#43;91 99957</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Email :</span></i></b><i><span style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#17365D">
<a href="mailto:amith.bharathan@tst.co.in"><span style="color:#17365D">amith.bharathan@tst.co.in</span></a></span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
</div>
</body>
</html>

My code snippet is:

private String getTextFromMimeMultipart(
            MimeMultipart mimeMultipart) throws Exception{
        String result = "";
        int count = mimeMultipart.getCount();
        System.out.println("____________START______GET MULTI PART"+count);
        for (int i = 0; i < count; i++) {
            BodyPart bodyPart = mimeMultipart.getBodyPart(i);
            if (bodyPart.isMimeType("text/plain")) {
                System.out.println("11111111111111");
                result = result + "\n" + bodyPart.getContent();
                //System.out.println("RESULT "+result);
             //   break; // without break same text appears twice in my tests
            }   if (bodyPart.isMimeType("text/html")) {
                System.out.println("2222222");
                String html = (String) bodyPart.getContent();
                System.out.println("22 bodypart "+bodyPart.getContentType());

                result = result + "\n >>> " + org.jsoup.Jsoup.parse(html).text();
            }  if (bodyPart.getContent() instanceof MimeMultipart){

                result = result + getTextFromMimeMultipart((MimeMultipart)bodyPart.getContent());
                System.out.println("3333333333333"+(MimeMultipart)bodyPart.getContent());

                Multipart multiPart = (MimeMultipart)bodyPart.getContent();
                System.out.println("multipart COUNT "+multiPart.getCount());
                for (int v = 0; v < multiPart.getCount(); v++) {
                    MimeBodyPart part = (MimeBodyPart) multiPart.getBodyPart(v);
                    System.out.println("PART ENCODING 111"+part.getEncoding()+"DISPOSITION "+part.getDisposition());
                  //  downloadFile("fl"+v, part) ;
                    if (Part.ATTACHMENT.equalsIgnoreCase(part.getDisposition())) {

                        downloadFile("fl"+v, part) ;
                    } if (Part.INLINE.equalsIgnoreCase(part.getDisposition())) {

                     System.out.println("_________________INLINE___________");
                }
                    if(part.getDisposition() == null){
                        System.out.println("INLINE FILE NAME "+part.getFileName());

                        //downloadFile("fl"+v, part) ;
                    }
                }

            }
        }
        System.out.println("____________END___________"+result);
        return result;
    }

O/P:

____________START______GET MULTI PART 2

2222222

22 bodypart text/html; charset=us-ascii ____________END___________

  The body parts   Regards,   Amith K  Bharathan Software Engineer   TST Software & Systems India Pvt. Ltd. Infopark, Kakkanad-682 030 Mob   : +91 99947 Email : amith.bharathan@tst.co.in     MULTI PART 2

INLINE IMAGE FILLLLL NAMEEEE null I/P com.sun.mail.imap.IMAPInputStream@75b3adecMIME

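If you get a multipart message, you look through each of its parts. You handle text/plain parts, text/html parts, and multipart parts. If a part of the top-level message is itself a multipart, you look inside that sub-multipart for images. But you never look for images in the top-level multipart. Add an "else" clause to the top-level "if" statement and you will see what you are missing.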
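
As a rough sketch of where that "else" branch could lead, the following walks every part of the message recursively, not only the sub-multiparts, and collects each `image/*` part keyed by its `Content-ID` header. The class `InlineImageCollector` and its methods are hypothetical names, not part of JavaMail. The sketch assumes the `javax.mail` API (JavaMail 1.4 or later; Jakarta Mail 2.x uses the `jakarta.mail` package instead). It also assumes that the `cid:` value in an `<img src>` matches the image part's `Content-ID` with the angle brackets removed, which is the usual convention but worth verifying against your own messages.

```java
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.Part;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects image parts from anywhere in the MIME tree.
public class InlineImageCollector {

    /** Returns image bytes keyed by Content-ID (angle brackets stripped). */
    public static Map<String, byte[]> collectImages(Part root)
            throws MessagingException, IOException {
        Map<String, byte[]> images = new LinkedHashMap<>();
        walk(root, images);
        return images;
    }

    private static void walk(Part part, Map<String, byte[]> images)
            throws MessagingException, IOException {
        if (part.isMimeType("multipart/*")) {
            // Recurse into every child, at the top level and at nested levels alike.
            Multipart multipart = (Multipart) part.getContent();
            for (int i = 0; i < multipart.getCount(); i++) {
                walk(multipart.getBodyPart(i), images);
            }
        } else if (part.isMimeType("image/*")) {
            // The branch the original code never reaches for top-level parts.
            images.put(keyFor(part, images.size()), readAll(part.getInputStream()));
        }
        // text/plain, text/html and other leaf types are ignored in this sketch.
    }

    private static String keyFor(Part part, int index) throws MessagingException {
        String[] ids = part.getHeader("Content-ID");
        if (ids != null && ids.length > 0) {
            return ids[0].trim().replaceAll("^<|>$", "");
        }
        // Fall back to the file name, or a positional key when neither is present.
        String name = part.getFileName();
        return name != null ? name : "part-" + index;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        try (InputStream input = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = input.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }
}
```

Calling `InlineImageCollector.collectImages(message)` with the `Message` fetched from the IMAP folder should then return the logo under the key `image001.png@01D190D9.38FE7C00`, provided the sender attached it as a part carrying that Content-ID.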

You are making some assumptions about the structure of MIME messages that are not generally correct.
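
To check such assumptions against a real message, it can help to print the actual part tree before writing any extraction logic. The sketch below is one hypothetical way to do that (the class `MimeStructureDumper` is not part of JavaMail). It assumes the `javax.mail` API and lists each part's base content type, disposition, and `Content-ID`, indented by nesting depth.

```java
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.Part;
import javax.mail.internet.ContentType;
import java.io.IOException;

// Hypothetical debugging helper: renders the MIME tree as indented text.
public class MimeStructureDumper {

    public static String describe(Part root) throws MessagingException, IOException {
        StringBuilder out = new StringBuilder();
        describe(root, 0, out);
        return out.toString();
    }

    private static void describe(Part part, int depth, StringBuilder out)
            throws MessagingException, IOException {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        // Keep only "type/subtype", dropping parameters such as charset or boundary.
        String baseType = new ContentType(part.getContentType()).getBaseType();
        String[] ids = part.getHeader("Content-ID");
        String cid = (ids != null && ids.length > 0) ? ids[0] : null;
        out.append(baseType)
           .append(" [disposition=").append(part.getDisposition())
           .append(", cid=").append(cid)
           .append("]\n");

        if (part.isMimeType("multipart/*")) {
            Multipart multipart = (Multipart) part.getContent();
            for (int i = 0; i < multipart.getCount(); i++) {
                describe(multipart.getBodyPart(i), depth + 1, out);
            }
        }
    }
}
```

Printing `MimeStructureDumper.describe(message)` shows where the image part actually sits. Messages with an inline logo often use a `multipart/related` root that holds the HTML body (sometimes wrapped in a `multipart/alternative`) next to an `image/png` sibling, but the layout varies by sender, so dumping it first avoids guessing.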