I have XML documents to index, but they come in two different formats.

Sample XML document, format 1:

<?xml version="1.0" encoding="UTF-8"?>

<![CDATA[Some Data Here...]]>

If I index a document in this format, Elasticsearch successfully detects the content type as "application/xml" and returns the document on search.

Sample XML document, format 2:

When the same data is present in the following format, however, Elasticsearch does not detect the content type and does not return the document on search.

<?xml version="1.0" encoding="UTF-8"?>

<![CDATA[<div><p>Some Data Here...</p>
<p><span style="font-family:Mangal" lang="HI">Some Data Here...</scan></p>]]>
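
One thing that may be worth ruling out is whether either sample is well-formed XML in the first place. The following is only a minimal sketch: it uses Python's standard xml.etree.ElementTree as a stand-in for a strict XML parser, which is an assumption and not what Elasticsearch uses for content-type detection. The helper names and the <doc> root element are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'

# The two samples from this post, reproduced as posted.
FORMAT_1 = DECL + '<![CDATA[Some Data Here...]]>'
FORMAT_2 = DECL + (
    '<![CDATA[<div><p>Some Data Here...</p>\n'
    '<p><span style="font-family:Mangal" lang="HI">Some Data Here...</scan></p>]]>'
)


def is_well_formed(xml_text):
    """Return True if a strict parser accepts the text as XML."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def wrap_in_root(xml_text):
    """Hypothetical helper: place the CDATA payload inside a <doc> root element."""
    payload = xml_text[len(DECL):]
    return DECL + "<doc>" + payload + "</doc>"


if __name__ == "__main__":
    for name, sample in (("format 1", FORMAT_1), ("format 2", FORMAT_2)):
        print(name, "as posted:", is_well_formed(sample))
        print(name, "with root:", is_well_formed(wrap_in_root(sample)))
```

A strict parser rejects both samples as posted, because a CDATA section is only allowed inside an element, and accepts both once they are wrapped in a root element. This does not by itself explain why Elasticsearch treats the two formats differently.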


Could someone clarify why these two formats behave differently?